The market for search engines offering public transport connections is already quite saturated. In Poland, services such as Google Maps, jakdojade.pl, e-podróżnik.pl, and rozkladzik.pl have been operating for many years.

However, if you want to learn how to implement a much simpler version of such algorithms, this article is for you.

I will describe what the network model looks like, what to consider as input data, what output data should be expected, and how the algorithm calculating the routes works.

In the second part of this article, I will show you how to optimize this process knowing the geographical locations of bus stops.

## Prerequisites for Generating Public Transit Routes

First, let’s discuss the input data. This consists of data entered by the user, as well as bus timetable data alongside information of the geographical location of the routes. The data sets provided by the user can include:

- Initial bus stop
- Terminal bus stop
- Travel starting time
- Maximum time that the user is willing to spend on transferring between bus stops (as part of a change)

Bus stop geographical location data sets must consist of:

- Longitude
- Latitude
- ID of a bus stop
- Type of transit stop (tram, bus, tram-bus)

Timetable data must be composed of a variance list of each bus line (there may be many variants of one line, e.g. a variant taking into account a detour of certain streets).

Each variant of the journey contains its list of bus stops, with the bus stop information including:

- Name of the bus stop
- Corresponding street
- Bus stop’s ID corresponding to the one of the bar in the list of transit stops
- List of timetables for a given day (Weekdays, Saturday, and Sunday)
- A list of departure times from a given transit stop for each day (tram or bus)

Another issue is the mock-up of the transport network model. One of the easiest and most frequently used models presenting this network is a directed graph. In this model, bus stops act as vertices, and the transit network connections make up the edges of the graph.

For example, bus travel from transit stop A to transit stop B or pedestrian journeys. On-foot journeys, as the name implies, are those that can be reached on-foot within a short time, e.g. 5 minutes.

Below you can see an exemplary graphic displaying the vertices of such a graph superimposed on a map of Wrocław, Poland.

## Problem Specification

To illustrate the problem, I created an example graph of transit links:

The graph corresponds with a sample public transport timetable:

If the passenger intends to travel from transit stop 10 to transit stop 9, they can take three possible routes:

- The first route assumes a transfer between buses at
*transit stop**4*from the*B**lue**L**ine*to the*R**ed**L**ine*. - The second route assumes that the passenger will travel along the
*B**lue**L**ine*to*transit**stop**6*, where they will proceed on-foot to*transit**stop 5*, from where they will continue their journey along the*R**ed**L**ine*. - Route number three is one of the least ideal in terms of the number of stops. This route adopts a circular route, consisting of travel along
*Blue*and*Black Lines*, with a final interchange at*transit**stop 11*onto the*G**reen**L**ine*. Such a ‘roundabout’ route can be run, for example, through the city beltway and despite the greater number of transit stops along the route, it may prove faster than an alternative trip through city center traffic.

We can illustrate each route alongside bus arrival times at given stops. We can see that routes 1 and 2, despite being separate journeys, take the same amount of time. The ‘roundabout’ journey on route number 3 expends considerably more time.

To enable faster route finding, we can create an algorithm based on the generation of two-dimensional arrays.

The generation of arrays is a process that ties up resources, but it is a one-off operation. Simply searching for routes using the previously generated matrix allows you to find routes in a time acceptable for the user.

## Minimum Change Number Matrix

First, we need a two-dimensional matrix of the minimum number of transfers. For this example, we can call it the *Q matrix*. Each cell value Q [start, end] corresponds to the minimum number of transfers on the route that should be made to get from the Initial bus stop to the Terminal bus stop.

### Creating a Matrix

### Step 1.

The first step in obtaining the *Q matrix* is to create a square *T-Matrix*. The easiest way to implement it, would be as a two-dimensional array, whereby the size of the T Matrix is equal to the number of vertices in the graph. Rows and columns in this matrix label the specific vertices. Initially, the matrix consists only of zero.

### Step 2.

The second step is to check whether you can travel from the *start* bus stop to the *terminal *bus stop. If the direct journey does not occur, then a value 0 is inserted into the cell *T* [*start, stop*]. When such a journey does occur, then a value 1 is inserted into the cell *T* [*start, stop*].

*Example:*

From *transit stop 1* you can travel to *transit **stops 2, 3, 12, 11 *using the *B**lack **L**ine* and to *transit **stops 7, 4, 5, 9* using the *R**ed **L**ine*.

Therefore, the first row demonstrates the following:

*T [1,2] = 1 **– Journey **along the B**lack **L**ine*

*T [1,3] = 1 – **Journey along the B**lack **L**ine*

*T [1.4] = 1 – **Journey along the Red* *L**ine*

*T [1.5] = 1 – **Journey along the Red* *L**ine*

*T [1.7] = 1 – **Journey along the Red* *L**ine*

*T [1.9] = 1 – **Journey along the Red* *L**ine*

*T [1.11] = 1 – **Journey along the Black* *L**ine*

*T [1.12] = 1 – **Journey along the Black* *L**ine*

The remaining values in this row will adopt the value 0 as there is no direct connection between these transit stops. As such, the entire T-Matrix looks as follows:

During the designation of the T-Matrix, we didn’t take the pedestrian links into consideration. However, they can be included in this algorithm by treating pedestrian links between bus stops like bus routes. So, for on-foot realized bus journeys, in the cell T [start, stop] we simply need to insert value 1.

Additionally, the user can set the maximum amount of time they wish to spend on-foot. Therefore, it is necessary to create multiple T-Matrices. One such matrix as shown in *Table **3*, for users who do not wish to travel on foot, in addition to related tables of the T-Matrix , e.g. *T5min* – for a maximum of 5 minutes–walking, *T10min* for a maximum of 10 minutes–walking.

Assuming walking times between *transit **stop**s** 4, 5,* and* 6* is the same, each taking 4 minutes, you can create a *T5min* table. It will be similar to the *T table*, however values in the following cells will change from zero to one:

*T*[*4,5*]* = 1*

*T*[*5,4*]* = 1*

*T*[*4,6*]* = 1*

*T*[*6,4*]* = 1*

The table below illustrates the generated T-Matrix including journeys made on-foot. Values that differ from the value in Table 4 have been underlined.

### Step 3.

Now we’re able to generate the Q-Matrix (based on T-Matrix).

The first step in calculating the *Q **M**atrix* is to calculate based on each of the exponents (powers) *p* for the *T Matrix* in the range [*p = 1* … *maxt**]*, with *maxt* representing the maximum number of possible bus interchanges.

After calculating the *T Matrix* to each of the exponents *p*, its subsequent elements *T* [i, j] are then verified. If this value is greater than zero, then it is entered into the *Q Matrix* in the corresponding element *Q* [*i**, j*]. The new value is entered into the element *Q* [i, j] only if the given element *Q* [i, j] is currently equal to zero. The only exception is the diagonal of the *Q Matrix*, as all elements in it remain zero.

The generated range is the representation of the matrix of the smallest number of transfers. In every cell *Q* [*start, stop*] there is information concerning the minimum number of transfers with which you can travel from the initial bus stop to the terminal bus stop.

Table 5 presents the matrix of the minimum number of bus transfers, with a maximum of one change (in a maximum of two bus journeys). For example, in the first row, it can be observed that a combination of *transit stops 1* and *transit stop 6* was established because, from the initial bus stop you can travel to the terminal bus stop along the *Red Line, *then changing to the *Blue Line*.

In the second row, three new values were created (in relation to the T-Matrix). As such, from *transit **stop 2* you are granted one change to *transit **stops* (*5, 6,* and* 9*). This is possible because:

*Bus Stops 2 -> 9 – travel **along the **B**lack **L**ine, chang**ing** to **the G**reen **L**ine*

*Bus Stops 2 -> 6 – **travel **along the B**lack **L**ine, chang**ing** to **the G**reen **L**ine*

*Bus Stops 2 -> 5 – **travel **along the B**lack **L**ine, chang**ing** to **the G**reen **L**ine*

In *Table **6*, a more complex matrix is presented, permitting up to three bus changes. It sets out the majority of possible combinations for the graph used as in the example. E.g., in cell [*3,2*] there is a value of *4*. It means that by using all 4 possible *Lines* (with three transfers), you can travel from *transit **stop 3* to *transit **stop 2*. This can be achieved by traveling along the *Black Line*, then *G**reen*, *B**lue*, before returning to the *B**lack** Line*.

## Generating Bus Routes from a Calculated Matrix

Bus routes are created by finding all (or a sufficient number of) combinations of transit stops where a passenger needs to change between bus lines in order to get from the initial bus stop to the terminal bus stop. This is done by recursively searching the *Q* table containing the matrix of the minimum number of transfers. These combinations are then supplemented with missing transit stops to create complete ro