The market for search engines offering public transport connections is already quite saturated. In Poland, services such as Google Maps, jakdojade.pl, e-podróżnik.pl, and rozkladzik.pl have been operating for many years.

However, if you want to learn how to implement a much simpler version of such algorithms, this article is for you.

I will describe what the network model looks like, what to consider as input data, what output data should be expected, and how the algorithm calculating the routes works.

In the second part of this article, I will show you how to optimize this process knowing the geographical locations of bus stops.

## Prerequisites for Generating Public Transit Routes

First, let’s discuss the input data. This consists of data entered by the user, as well as bus timetable data alongside information of the geographical location of the routes. The data sets provided by the user can include:

- Initial bus stop
- Terminal bus stop
- Travel starting time
- Maximum time that the user is willing to spend on transferring between bus stops (as part of a change)

Bus stop geographical location data sets must consist of:

- Longitude
- Latitude
- ID of a bus stop
- Type of transit stop (tram, bus, tram-bus)

Timetable data must be composed of a variance list of each bus line (there may be many variants of one line, e.g. a variant taking into account a detour of certain streets).

Each variant of the journey contains its list of bus stops, with the bus stop information including:

- Name of the bus stop
- Corresponding street
- Bus stop’s ID corresponding to the one of the bar in the list of transit stops
- List of timetables for a given day (Weekdays, Saturday, and Sunday)
- A list of departure times from a given transit stop for each day (tram or bus)

Another issue is the mock-up of the transport network model. One of the easiest and most frequently used models presenting this network is a directed graph. In this model, bus stops act as vertices, and the transit network connections make up the edges of the graph.

For example, bus travel from transit stop A to transit stop B or pedestrian journeys. On-foot journeys, as the name implies, are those that can be reached on-foot within a short time, e.g. 5 minutes.

Below you can see an exemplary graphic displaying the vertices of such a graph superimposed on a map of Wrocław, Poland.

## Problem Specification

To illustrate the problem, I created an example graph of transit links:

The graph corresponds with a sample public transport timetable:

If the passenger intends to travel from transit stop 10 to transit stop 9, they can take three possible routes:

- The first route assumes a transfer between buses at
*transit stop**4*from the*B**lue**L**ine*to the*R**ed**L**ine*. - The second route assumes that the passenger will travel along the
*B**lue**L**ine*to*transit**stop**6*, where they will proceed on-foot to*transit**stop 5*, from where they will continue their journey along the*R**ed**L**ine*. - Route number three is one of the least ideal in terms of the number of stops. This route adopts a circular route, consisting of travel along
*Blue*and*Black Lines*, with a final interchange at*transit**stop 11*onto the*G**reen**L**ine*. Such a ‘roundabout’ route can be run, for example, through the city beltway and despite the greater number of transit stops along the route, it may prove faster than an alternative trip through city center traffic.

We can illustrate each route alongside bus arrival times at given stops. We can see that routes 1 and 2, despite being separate journeys, take the same amount of time. The ‘roundabout’ journey on route number 3 expends considerably more time.

To enable faster route finding, we can create an algorithm based on the generation of two-dimensional arrays.

The generation of arrays is a process that ties up resources, but it is a one-off operation. Simply searching for routes using the previously generated matrix allows you to find routes in a time acceptable for the user.

## Minimum Change Number Matrix

First, we need a two-dimensional matrix of the minimum number of transfers. For this example, we can call it the *Q matrix*. Each cell value Q [start, end] corresponds to the minimum number of transfers on the route that should be made to get from the Initial bus stop to the Terminal bus stop.

### Creating a Matrix

### Step 1.

The first step in obtaining the *Q matrix* is to create a square *T-Matrix*. The easiest way to implement it, would be as a two-dimensional array, whereby the size of the T Matrix is equal to the number of vertices in the graph. Rows and columns in this matrix label the specific vertices. Initially, the matrix consists only of zero.

### Step 2.

The second step is to check whether you can travel from the *start* bus stop to the *terminal *bus stop. If the direct journey does not occur, then a value 0 is inserted into the cell *T* [*start, stop*]. When such a journey does occur, then a value 1 is inserted into the cell *T* [*start, stop*].

*Example:*

From *transit stop 1* you can travel to *transit **stops 2, 3, 12, 11 *using the *B**lack **L**ine* and to *transit **stops 7, 4, 5, 9* using the *R**ed **L**ine*.

Therefore, the first row demonstrates the following:

*T [1,2] = 1 **– Journey **along the B**lack **L**ine*

*T [1,3] = 1 – **Journey along the B**lack **L**ine*

*T [1.4] = 1 – **Journey along the Red* *L**ine*

*T [1.5] = 1 – **Journey along the Red* *L**ine*

*T [1.7] = 1 – **Journey along the Red* *L**ine*

*T [1.9] = 1 – **Journey along the Red* *L**ine*

*T [1.11] = 1 – **Journey along the Black* *L**ine*

*T [1.12] = 1 – **Journey along the Black* *L**ine*

The remaining values in this row will adopt the value 0 as there is no direct connection between these transit stops. As such, the entire T-Matrix looks as follows:

During the designation of the T-Matrix, we didn’t take the pedestrian links into consideration. However, they can be included in this algorithm by treating pedestrian links between bus stops like bus routes. So, for on-foot realized bus journeys, in the cell T [start, stop] we simply need to insert value 1.

Additionally, the user can set the maximum amount of time they wish to spend on-foot. Therefore, it is necessary to create multiple T-Matrices. One such matrix as shown in *Table **3*, for users who do not wish to travel on foot, in addition to related tables of the T-Matrix , e.g. *T5min* – for a maximum of 5 minutes–walking, *T10min* for a maximum of 10 minutes–walking.

Assuming walking times between *transit **stop**s** 4, 5,* and* 6* is the same, each taking 4 minutes, you can create a *T5min* table. It will be similar to the *T table*, however values in the following cells will change from zero to one:

*T*[*4,5*]* = 1*

*T*[*5,4*]* = 1*

*T*[*4,6*]* = 1*

*T*[*6,4*]* = 1*

The table below illustrates the generated T-Matrix including journeys made on-foot. Values that differ from the value in Table 4 have been underlined.

### Step 3.

Now we’re able to generate the Q-Matrix (based on T-Matrix).

The first step in calculating the *Q **M**atrix* is to calculate based on each of the exponents (powers) *p* for the *T Matrix* in the range [*p = 1* … *maxt**]*, with *maxt* representing the maximum number of possible bus interchanges.

After calculating the *T Matrix* to each of the exponents *p*, its subsequent elements *T* [i, j] are then verified. If this value is greater than zero, then it is entered into the *Q Matrix* in the corresponding element *Q* [*i**, j*]. The new value is entered into the element *Q* [i, j] only if the given element *Q* [i, j] is currently equal to zero. The only exception is the diagonal of the *Q Matrix*, as all elements in it remain zero.

The generated range is the representation of the matrix of the smallest number of transfers. In every cell *Q* [*start, stop*] there is information concerning the minimum number of transfers with which you can travel from the initial bus stop to the terminal bus stop.

Table 5 presents the matrix of the minimum number of bus transfers, with a maximum of one change (in a maximum of two bus journeys). For example, in the first row, it can be observed that a combination of *transit stops 1* and *transit stop 6* was established because, from the initial bus stop you can travel to the terminal bus stop along the *Red Line, *then changing to the *Blue Line*.

In the second row, three new values were created (in relation to the T-Matrix). As such, from *transit **stop 2* you are granted one change to *transit **stops* (*5, 6,* and* 9*). This is possible because:

*Bus Stops 2 -> 9 – travel **along the **B**lack **L**ine, chang**ing** to **the G**reen **L**ine*

*Bus Stops 2 -> 6 – **travel **along the B**lack **L**ine, chang**ing** to **the G**reen **L**ine*

*Bus Stops 2 -> 5 – **travel **along the B**lack **L**ine, chang**ing** to **the G**reen **L**ine*

In *Table **6*, a more complex matrix is presented, permitting up to three bus changes. It sets out the majority of possible combinations for the graph used as in the example. E.g., in cell [*3,2*] there is a value of *4*. It means that by using all 4 possible *Lines* (with three transfers), you can travel from *transit **stop 3* to *transit **stop 2*. This can be achieved by traveling along the *Black Line*, then *G**reen*, *B**lue*, before returning to the *B**lack** Line*.

## Generating Bus Routes from a Calculated Matrix

Bus routes are created by finding all (or a sufficient number of) combinations of transit stops where a passenger needs to change between bus lines in order to get from the initial bus stop to the terminal bus stop. This is done by recursively searching the *Q* table containing the matrix of the minimum number of transfers. These combinations are then supplemented with missing transit stops to create complete routes.

The first step in this process is to check if there is any connection between initial and terminal transit stops. For such a connection to exist, the value of Q [start, stop] must be greater than zero. For example, in *Table *7; for the initial *transit **stop 10* and terminal *transit **stop* *9*, by checking the value of *Q* [10, 9], in the case of the example matrix presented below; the value presented is 2. This kind of connection exists, whereby you should travel along at least two lines in order to get from the initial stop to the terminal stop.

If the connection exists, then the next step is performed, i.e. finding a combination of stops on which transfers will be made on the route. This search starts from the final stop, in this case from *transit **stop 9*. You must recognize all interchanges leading back to the initial stop [5]. This can be done through a few steps:

- For final stops (in the first iteration, this can only be
*transit**stop 9*) you should find all the transit stops from which you can reach the final stop. This can be done by searching the end stop column and search for transit stops with a value of 1 (direct connection). - The retrieved stops are added to the route collection. If they are also a start-stop, then the algorithm will complete the search.
- The retrieved stops are treated as final stops, and the first step will be performed for them.

The operation of the algorithm can be viewed on the example Q Table, on stops numbered *10* and *9*.

- For the first step of the algorithm, the final transit stop
*number 9*is found. The corresponding column in the Q Table is searched for all rows with the value equal to 1. These are*L**ines*:*1, 4, 5, 7,*and*11* - Retrieved bus stops are added to the route collection

As none of the routes above contain a bus stop number 10, the search will therefore continue.

- All retrieved transit stops (
*1,4,5,7,11*) will return to Step 1. For each of these, their Q Table column is searched for direct connections (value 1)

Recursion continues until the route is connected to the initial transit stop. In practice however, finding all routes by searching ranges recursively can be expensive. Therefore, additional restrictions are used, such as searching until a maximum number of connections or time constraints are reached.

For the example used, you can generate 136 results. *Figure **4.3* below contains an example of some of them:

The above routes contain only interchanges. In order to receive complete routes, you must complete them with the rest of the transit stops. Since these stops can be connected by more than one Line, different route variants can be created. *Figure **6* below shows an example of completing the route for the rest of the transit stops:

## Results

To determine the quality of the routes generated, I had created a test application. You can view some examples of route generation in GIF provided below.

Additionally, as I wished to verify if the results are correct, the results were then compared with those from the *Google Maps* service. The reason behind this, is because Google boasts one of the top search engines for mass transit connections. The point of this comparison was only to check if simple algorithms can bear results, even close to those produced by Google Maps, and if they are correct.

The Drawings below present routes generated by Google Maps and the test application. As you can see, routes of the described application are very similar. They differ only in the place of changing from one line to the other. Furthermore, travel times are also very similar.

**Note:** this is simply a check for route correctness. Comparing simple algorithms to Google Maps doesn’t make sense.

### Test Case A

Starting Point: Stanki, 50-001 Wrocław

Final Ptop: Sołtysowicka, 50-001 Wrocław

Starting Time: Monday 8:00; 1st, of March 2018

### Test Case B

Starting Point: Piramowicza, 50-001 Wrocław

Final stop: Pisarzowice – Odrzańska, 55-330

Route start time: Monday 8:00 1 March 2018

### Test Case C

Starting Point: Kadłubka, 50-001 Wrocław

Final Stop: Bacciarellego, 50-001 Wrocław

Route Start Time: Monday 8:00 1 March 2018

**Literature:**

- Koszelew J, Ostrowski K,
*Evolutionary Algorithm with Geographic Heuristics for Urban Public Transportation*, Springer-Verlag, Berlin, 2012 - Koszelew J, Ostrowski K,
*Approximation Method to Route Generation in Public Transportation**Network*, 2008 - Hartley J, Wu Q,
*Accommodating User Preferences in the Optimization of Public Transport Travel*. 2004 - Koszelew J,
*An Evolutionary Algorithm for the Urban Public Transportation*. Springer, Heidelberg, 2011 - Koszelew J, Ostrowski K,
*A Genetic Algorithm with Multiple Mutation which Solves Orienteering Problem in Large Networks*, 2013 - Wellman M ., Ford M, Larson K,
*Path planning under time-dependent uncertainty, Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence*, 1995 - Koszelew J,
*The Theoretical Framework of Optimization of Public Transport Travel*, 2007