Creating a Web-Based Schedule Generator

Layton is my Homeboy · 03-06-2012, 09:18 PM

I am in the process of developing a web-based schedule generator for OOTP. This project is not yet complete. In fact, it will likely take me some time to finish it.

Creating good baseball schedules for an MLB-type of league is a notoriously difficult task. Several excellent papers have been written (some available online, others not) which describe various algorithmic solutions for creating desirable MLB schedules. However, despite a large amount of academic and professional attention being paid to this problem it remains difficult to solve in a way that is both efficient (in terms of processing) and optimal (in terms of maximizing desirable features in the resulting schedule).

The task of creating a schedule generator for OOTP is even harder. OOTP supports a wide variety of league formats (division structure, number of teams, etc.), schedule lengths (number of games), schedule formats (interleague/no interleague), game distributions (balanced/unbalanced), etc. As a result, a generalized schedule generator for OOTP needs to be flexible enough to handle a large number of use cases. Algorithms that work well for a modern MLB setup need to be generalized to work for very different types of leagues and schedules.

The purpose of this thread is to gather some advice and feedback from the OOTP community that will help me to work through some of the programming and design challenges that I am facing in my effort to develop a schedule generator. I appreciate any help that anyone is willing to offer, but I am particularly interested in getting some input from the following categories of people:

Those who have experience making OOTP schedules by hand.
Those who are knowledgeable about the paramaters and constraints that are used to create actual schedules for the MLB and other leagues (both now and historically).
Those with computer programing skills who can offer some advice in terms of the algorithms used to implement the solution design.

In subsequent posts, I will get much more specific about the type of information and assistance that I need. However, before I come to that, there is some background information that will be useful to this discussion (which will be posted shortly).

For now, I strongly encourage anyone who is interested in this thread to read GMO's Making Baseball Schedules blog, in its entirety (there are only 6 posts on the blog, and the whole thing can be read in an hour). It is an extremely well-written description of some of the steps that GMO takes when creating a schedule for OOTP. It will serve as useful background reading for the purposes of this thread.

Layton is my Homeboy · 03-06-2012, 09:57 PM

What makes a good schedule? This is the first question that I want to ask the community.

OOTP will generate perfectly fine schedules all on its own. So, why bother going to all the effort of creating a schedule generator outside of the game? The basic reason is that the schedules OOTP generates are not always completely satisfactory.

So, part of the goal of our schedule generator will be to emphasize traits that make a schedule desirable and to de-emphasize traits that a make a schedule undesirable.

So, what makes a good schedule? I think that question has to be answered using two lists. The first is a list of absolute requirements. A schedule that does not meet these requirements cannot be used. The second is a list of desirable traits. A good schedule is one that conforms as much as possible with these desirable traits.

As part of our solution design, we will be implementing combinatorial optimization algorithms that will ensure that we are likely to get a schedule that is both good and feasible.

I am interested to get some thoughts from the many great OOTP schedule-makers as to what should be included on these two lists. In addition, I would like to get a sense of which items in the "desirable" list are the most important. When we apply our optimization functions, we will assign "scores" to possible schedules based on the degree to which they conform the desirable traits. We can therefore assign weights to each of them relative to their importance.

To get the ball rolling, here are a few items that I consider important.

Requirements (Must-Haves)

Teams play the same number of games as each other.
Each team plays the same number of home and away games (with the consequence that each team must play an even number of games throughout the season).
Each team plays all and only the opponents they are supposed to play (as defined by the league structure and schedule parameters).
Each team plays an appropriate number of games against each opponent (according to the league structure and schedule parameters).

Desirable Traits (Good-to-Haves)

If possible, each teams plays an equal number of home and away games against each opponent.
Series are arranged so that they regularly end on a Sunday.
Teams play roughly the same number of home games on weekends.
No team plays too many consecutive series away from home (ideally, three series away from home is the maximum).
Teams do not play consecutive series against the same opponent (ie semi-repeaters).
Travel time is minimized.*
Game times are approrpriate to the era and games scheduled on the same day do not all start at the same time.**
Teams do not all take an off-day on the same day (ie there is at least one game for every day in the schedule, other than during an all-star break).***
Teams do not have too many days in between days off.
The end of the season features more divisional games.
Opening day features mostly divisional games.
Interleague games are avoided late in the season.****

Comments

A few of these items are marked with asterisks. These items merit some further comment.

*In a real-world scheduling application, we would attempt to minimize travel time by figuring out the actual, physical distance between each of the teams in the league. When creating a schedule for OOTP, it isn't possible to get to that level of precision. When creating a schedule in OOTP's XML schedule format, each team is represented by a number. The specific team that is assigned to a particular number rotates year by year. So, when making the schedule, you don't necessarily know which specific team is "team 1" will be. You only know which division team 1 is in. So, for the purposes of minimizing travel time in OOTP, the best we can do is to try to avoid playing consecutive away series against opponents from different divisions (on the assumption that the divisions have geographic significance).

**As LGO pointed out here, there is a lack of historical data regarding specific game start times. Retrosheet has some data available that gives a simple breakdown of day vs. night games year-by-year, but the data are not more refined than that.

***This is mainly an aesthetic preference, but I think that it is probably the biggest complaint that most people have about the schedules that OOTP generates. An OOTP-generated schedule typically has several days throughout the year where every team takes an off day. This is not realistic and I personally find it unsatisfying. I prefer to see 1/2 games back in the standings, and teams with games in hand against their division rivals. I think that it makes pennant races more interesting and dynamic. It is also more true to life.

****The current MLB schedule format puts most interleague games in the first half of the season. When the Astros move to the AL West, this will change, as there will have to be at least one interleague series at all times during the season. However, unless we are required to do this (as a result of having two subleagues that both have an odd number of teams), we will avoid late-season interleague play as much as possible.

Layton is my Homeboy · 03-06-2012, 11:33 PM

As I step through my thoughts on a solution to the scheduling problem, it will be helpful to define some terminology. This way, we will all be on the same page when discussing certain topics.

Schedule parameters are the options and settings defined by the user that dictate a number of requirements for the schedule. Parameters include the number of teams in the league, the league structure, the number of games played by each team, whether the schedule includes interleague play, whether the schedule is balanced or unbalanced, the typical series length, etc.

A team means a team in the league. When we are creating schedules, each team is represented by an integer.

A set is ordered if the arrangement of the members of the set in relation to one another has some significance. A set is unordered if the arrangement of the members of set does not have any significance.

A pairing is a unordered pair of teams. For example, the pairing of teams 1 and 2 is represented by the set {1,2}. Because a pairing is unordered, the pairing {1,2} is equivalent to the pairing {2,1}. In effect, a pairing represents a series, before we determine which team is the home team and which is the away team, or the number of games in the series. A pairing is feasible if the teams in the pairing are supposed to play each other during the season.

A series is a pairing of teams, where one of the teams is designated as the "home" team and the other is designated as the "away" team. A series also has an additional value: the number of games in the series. Series are the basic building blocks a schedule.

A series (or a pairing) can be a divisional series, a league series, or an interleague series. A divisional series is one in which both teams are in the same division. A league series is one in which both teams are in the same subleague, but they are not in the same division. An interleague series is one in which the two teams are not in the same subleague.

The games matrix is a notional table that represents the number of games that each team is supposed to play against each other team during the season (which is determined from the league parameters). Similarly, the series matrix is a notional table that represents the number of series that each team is supposed to play against each other team during the season.

A block is a set of pairings, such that every team in the league appears exactly once in one of the pairings in the block. For example, if there are four teams in the league, then one possible block would be {{1,2},{3,4}}. A block is unfinished if it consists of pairings. A block is finished if it consists of series (that is, pairings where the home and away values and a series length has been assigned). A feasible block is one in which all of the pairings are feasible.

There is one special block, which is the all-star block. If the league parameters call for an all-star break and game to be scheduled, the special all-star block will be inserted into the schedule in an appropriate place.

A stack (building on the block metaphor) is a set of blocks. A feasible stack is one in which the set of all of the pairings/series in all of the blocks matches up with the the series matrix. In other words, a feasible stack basically represents a set of blocks that will result in a schedule that conforms the to league parameters. An unordered stack is one in which the order of the blocks has not yet been determined. An ordered stack is one in which the blocks have been arranged into a particular order (generally, after an optimization process).

Layton is my Homeboy · 03-07-2012, 02:11 PM

So, with some of that background out of the way, it's time to start thinking about the basic approach to building a schedule algorithmically.

There are a number of different approaches that could be used. Personally, my thinking is heavily informed by GMO's Making Baseball Schedules blog. What I have come up with is an adaptation of the basic approach described on that blog.

What I like about GMO's approach is that it breaks down a complex problem into a series of simpler problems. The solutions to those simpler problems become building blocks that allow you to solve more complex problems.

Over the past several weeks of trial-and-error programming, the major steps that I have identified are as follows:

Step 0: Gather schedule parameter's based on the user's input.

Step 1: Based on the schedule parameters, figure out how many games each team should play against each opponent (that is, create the games matrix and the series matrix).

Step 2: Create a menu of every feasible block using the pairings defined in the games/series matrix. In other words, find every feasible arrangement of games such that every team is playing. In doing so, assign a desirability score to each block in the menu.

Step 3: Use the menu of feasible blocks to construct a feasible unordered stack (that is, an arrangement of blocks such that the use up all and only the series in the series matrix) that optimizes the selection of desirable blocks.

Step 4: Assign home and away values and series lengths to the pairings in the blocks in the stack, such that each team plays an appropriate number of home and away games against each opponent.

Step 5: Arrange the blocks in the stack in a way that optimizes desirable traits.

Step 6: Apply finishing touches such as game times, series offsets, etc.

In subsequent posts, I will describe the basic approaches that I am using at each step in this process (not necessarily in order) with a view to soliciting some advice and feedback to see if the approaches that I am using can be improved upon.

Layton is my Homeboy · 03-07-2012, 10:43 PM

Now that we have a general approach, we can start to get a bit more technical.

For this discussion, I am going to start at Step 2 (creating a menu of blocks). I will come back to step 1 later on in the thread (as I will discuss, I actually think that step 1 is the most difficult step of this solution).

Assumptions

Since we are skipping step 1, we will make a few assumptions about the data that we theoretically generated in that step.

We will assume that step 1 generated a full pairings/series matrix that tells us which opponents each team is supposed to play throughout the year. It may or may not be the case all teams play each other during the season. For example, if we have just one division in one subleague, then all teams in that division will play each other. By contrast, if we have an MLB-type of setup, then not every team will play each other due to the way that interleague play works.

We will also assume, based on the schedule parameters, that we know which division and subleague each team is in. As a result, whenever we put two teams together into a pairing, we will be able to determine whether that pairing represents a divisional series, a league series, or an interleague series.

Creating a Menu of Blocks

At this stage of the solution, we are trying to create a complete menu of feasible blocks that we can later use to construct the schedule. As a refresher, a block is defined as follows:

Quote:

Originally Posted by Layton is my Homeboy

A block is a set of pairings, such that every team in the league appears exactly once in one of the pairings in the block. For example, if there are four teams in the league, then one possible block would be {{1,2},{3,4}}. A block is unfinished if it consists of pairings. A block is finished if it consists of series (that is, pairings where the home and away values and a series length has been assigned). A feasible block is one in which all of the pairings are feasible.

So, our goal is to create a menu of every feasible block, based on the schedule parameters. Later on, we will select blocks from this menu to construct a feasible stack. The blocks that we will create at this stage may be used more than once in the stack.

At this stage, we are only dealing with unordered pairings. In other words, we don't care at this stage which teams are the "home" teams and which are the "away" teams. We just care about creating blocks satisfy two conditions: (1) every team in the league appears once and only once within the block; and (2) each pairing consists of two teams that are supposed to play each other (according to the schedule parameters).

In addition to creating a comprehensive menu of blocks, we are going to assign a desirability score to each block. Later, when we are selecting blocks to put together in a stack, we will try to select the most desirable blocks whenever possible.

So, what is a "desirable" block? A block consists of pairings, some of which might be divisional pairings, some of which might be league pairings, and some of which might be interleague pairings. In my view, a desirable block is one that is as homogeneous as possible. In other words, we want all of the pairings in the block (or as many of them as possible) to share the same type.

The reason that this is desirable is that we want to simplify the schedule-making process as much as possible. Having mixed game types in every block makes scheduling harder, because it is more difficult to select a bunch of mixed type blocks that add up to every team playing the correct number of games.

Homogeneous blocks will also help us with optimization later on in the solution. For example, we want to have mostly divisional games late in the season. Having a block that consists of mostly divisional games will help us accomplish this. Similarly, we don't want too many interleague games in the late season. Having an all-interleague block somewhere in the early season will help us with that.

Of course, it won't always be possible for every block to have only a single type of game. In fact, with Houston moving to the AL West next season, the real MLB schedule will be moving to a format in which there will always be at least one interleague game throughout every block in the schedule. So, instead of limiting ourselves just to single-game-type blocks, we will simply prefer blocks that are mainly a single type.

Desirability is based on homogeneity. So, to calculate the desirability score, we are going to add up the total number of divisional, league and interleague games in each block. Desirability is equal to the maximum value out of those three totals.

If there are 10 pairings in a block, then an optimally desirable block is one that consists of 10 divisional pairings or 10 league pairings or 10 interleague pairings. Any of those blocks will have a desirability score of 10. A block that has 7 divisional pairings, 2 league pairings, and 1 interleague pairing has a desirability score of 7. Given a choice between the '10' block and the '7' block, we will add the 10 block to the stack.

We can create the menu of blocks by looping through all of the available pairings for each 'slot' in the block. Because the order of the pairings within the block does not matter, we will filter out equivalent permutations (for example {{1,2},{3,4}} is equivalent to {{3,4},{1,2}}. This will limit our search space and save on processing. We will also filter out blocks that are not feasible (for example, blocks where team 1 appears in two different pairings).

Whenever we identify a feasible block, we will apply a desirability function and store the result in the block's data structure so that we can use it later.

Layton is my Homeboy · 03-07-2012, 11:06 PM

STEP 3: Create a Feasible Stack of Blocks

At this stage in the process, our goal is to select from our menu of blocks to create a feasible unordered stack. Recall the definitions of "block" and "stack":

Quote:

Originally Posted by Layton is my Homeboy

A block is a set of pairings, such that every team in the league appears exactly once in one of the pairings in the block. For example, if there are four teams in the league, then one possible block would be {{1,2},{3,4}}. A block is unfinished if it consists of pairings. A block is finished if it consists of series (that is, pairings where the home and away values and a series length has been assigned). A feasible block is one in which all of the pairings are feasible.

Quote:

Originally Posted by Layton is my Homeboy

A stack (building on the block metaphor) is a set of blocks. A feasible stack is one in which the set of all of the pairings/series in all of the blocks matches up with the the series matrix. In other words, a feasible stack basically represents a set of blocks that will result in a schedule that conforms the to league parameters. An unordered stack is one in which the order of the blocks has not yet been determined. An ordered stack is one in which the blocks have been arranged into a particular order (generally, after an optimization process).

In effect, a stack gets us very close to a final schedule for the league, except that we haven't yet figured out the best order in which to arrange the blocks in the stack and we haven't yet assigned home and away values to the pairings in the blocks in the stack.

When constructing the stack, there are a few things that we want to keep in mind. First, we only want a feasible stack. A feasiable stack is one that results in each team playing the correct number of series against each opponent, as determined in Step 1. So, if Team A is supposed to play 4 series against Team B throughout the season, the stack must result in exactly 4 series being played between those teams. Otherwise, the stack is not feasible.

The second thing to keep in mind is that we want, wherever possible, to use the most desirable blocks, as determined in step 2. In most use cases, it won't be possible to create a feasible stack completely out of optimal blocks. However, we want to optimize the stack by selecting the most desirable blocks wherever possible.

The third thing to keep in mind is that we want to avoid using the same block twice. In general, there is no reason that we can't use the same block more than once. However, we want to avoid doing that as much as possible. There are two reasons for this. The first is technical: using multiples of the same block will tend to result in more repeaters and semi-repeaters, which we want to avoid. The second reason is aesthetic: it simply feels more natural and organic to have a more diverse block selection.

In effect, what we are trying to do at this stage is an exercise in combinatorial oprimization. We are trying to arrange blocks together into a stack in an optimal way based on the desirability of the blocks in the stack.

The difficulty here is that the search space is very large. It would take too much memory and processing time to consider every possible combination of blocks. So, to save time, we need a shortcut way of finding the optimal stack.

To solve this problem, I have decided to apply a simple branch and bound algorithm.

The basic premise of branch and bound is that we will explore the search space by gradually expanding a solution tree. As we do so, will prune branches on the tree that either will result in a solution that either is not feasible or not optimal. This limits the search space so that we don't have to consider every possible solution.

Scoring Function

In order to solve any optimization problem, there has to be a value that we are trying to optimize. In this case, we are going to assign a score to each stack.

To begin with, each stack's "score" will be the sum of the desirability ratings of the blocks in the stack. So, when we optimize the stack, we will be looking for a stack that uses desirable blocks.

However, our scoring function needs to be modified a bit. If we just rely on the desirability of the blocks, there is a risk that we will get repeating blocks - something that we want to avoid (if possible). As a result, we will create an arbitrary penalty for using repeating blocks.

Bounding Function

We will use a two-pronged bounding function for our search tree. Nodes that fail either bounding condition are pruned from the tree. Nodes that pass both bounding conditions remain and can be branched in the next iteration.

The first bounding condition relates to the feasibility of the stack. If the blocks assigned to the stack result in any two teams playing each other too many times, then the stack is not feasible and the node is pruned.

The second bounding condition relates to the scoring function. If the node's score is lower than the score of the current incumbent, the node is pruned. If the node is not yet a complete candidate, then we use the theoretical maximum desirability score for any unassigned blocks, and compare that against the current incumbent. The maximum block score is equal to t/2, where t = the number of teams in the league.

Node Selection Policy

We will be using a global-best node selection policy. In other words, whenever we are looking for a node to expand, we will always look for the highest scoring node, regardless of where it appears in the tree.

Layton is my Homeboy · 03-08-2012, 09:12 PM

I am going to skip step 4, as that step is very easy to do. In that step, we assign home and away values and series lengths to the pairings, transforming them into proper series.

STEP 5: Optimize Block Order

At this stage, we have an unordered stack full of blocks. Each block consists of series. Each series consists of a home team, an away team, and a series length. We now have all the information we need to sort the blocks into an order that results in a desirable schedule.

The goal at this stage in the process is to determine an optimal or near-optimal arrangement of blocks. An optimal arrangement is one that maximizes desirable schedule features and minimizes undesirable schedule features. Some of these features are described earlier in this thread. However, I will repeat some of them here because they will inform how we approach this task.

Desirable Features:

Divisional games late in the season.
Divisional games on opening day.
Teams play roughly the same number of weekend home series.
Series ending on a Sunday.

Undesirable Features:

Repeaters and semi-repeaters (that is, back-to-back series between the same two teams).
Excessively long roadtrips.
Interleague games late in the season.
Too much travel (defined for the purposes of this exercise as playing consecutive away series against teams in different divisions).
Too much time between days off.

In effect, we are going to compare possible arrangements of blocks by assigning each arrangement a score. Arrangements will get a positive score if they feature desirable traits and a negative score if they feature undesirable traits.

To find an optimal arrangement, we could just try out every possible arrangement, figure out how desirable it is, and then use the most desirable one (ie a brute force alogorithm). Unfortunately, this is not a viable solution as it will take up far too much memory and processing time.

For example, let's suppose that we have 52 blocks (that is what a modern MLB-type schedule uses - a 26-week schedule times 2 block per week). There are 52! (that is 52 x 51 x 50 x 49 etc.) different orders in which we can arrange those blocks, or about 8.06581752 × 10 ^ 67. Obviously, this is not going to work.

So, we have two basic options. The first option is to use an alogrithm that limits the search space while still guaranteeing us the optimal block order (for example, the branch-and-bound technique that we used in Step 3). The second option is to use an algorithm that drastically limits the search space, but only gives us a near-optimal result.

In this case, I am inclined to use a technique that drastically cuts down the spearch space, at the expense of not being guaranteed the optimal solution. This particular problem is not well-suited to a branch-and-bound algorithm, because it is difficult to come up with an effective bounding function. A branch-and-bound solution without a good bounding function won't be very effecient, because it will fail to "prune" bad candidates and therefore it won't limit the search space effectively.

My inclination in this situation is to use a technique that will be much more efficient, while still giving us a near-optimal solution. In particular, I am going to use a genetic alorithm for this stage of the process.

The basic idea behind genetic algorithms is to mimic the process of biological evolution. So, we will start with a random population of candidate solutions. We will assign scores to each solution, based on the desirability criteria that we defined earlier. Then, we will drop the least "fit" (ie desirable) solutions from the field. The remaining solutions will become the "parents" for the next generation of solutions. For each parent, we will create a set of "child" solutions, each of which is very similar to its parent, but with a slight variance. Then, we will repeat the process of discaring the least fit solutions, and move on to the next generation.

We can repeat this process for a fixed number of generations, or a fixed amount of time, or until we reach a minimum threshold desirability. Whatever the case in that regard, we will choose the fittest solution that remains after the last generation and use that arrangement to construct our schedule.

The tricky part with this approach is figuring out a good way of varying our candidate solutions as they pass on to the next generation.

One idea that occurs to me is that each block within the arrangement can have its own score. We can do a number of variations on a parent solution by swapping the lowest-scoring block in a parent solution with various blocks from elsewhere in the stack. We can try swapping places between the lowest and the second lowest scoring blocks (on the hope that it will improve both of them). We can also make a few child solutions that just randomly swap any two blocks within the stack (just for the sake of having more variation). The parent solutions will themselves be passed on directly as child solutions as well (in case none of the variations improves on the parent). Finally, we can introduce a small pool of entirely new arrangements at each generation (again, just for the sake of more diversity).

Using this approach, we will be able to zero in on desirable arrangements fairly quickly, even though we are only looking at a very small portion of the search space. Of course, the more generations that we run, the better our final solution will be. In effect, we will be looking for an acceptable trade-off between speed and optimization.

ootpFox07 · 03-13-2012, 10:41 PM

Layton,
Awesome idea. This sounds like it's going to be the de-facto schedule generator. Not sure how you are planning to implement this, but I would love to be able to talk about this possibly being part of the OOLM. I'm trying to provide everyone with a development starting point, so let me know once you get to that point and see if it would make any sense.

Layton is my Homeboy · 03-14-2012, 06:36 PM

Thank you. I am planning to open source the code, so you would be more than welcome to incorporate it. I will be implementing it in PHP, so it should play nice with other web apps.

That said, I am expecting that development on this could take a while.

Sent from my iPhone using Tapatalk

ootpFox07 · 03-16-2012, 09:42 AM

Quote:

Originally Posted by Layton is my Homeboy

Thank you. I am planning to open source the code, so you would be more than welcome to incorporate it. I will be implementing it in PHP, so it should play nice with other web apps.

That said, I am expecting that development on this could take a while.

Cool. The OOLM framework is in pHP/MySQL too and is based on CodeIgniter/MVC. I added a Github project called Bonfire to handle user authentication, permissions and database maintenance functionality. Yours may or may not need that, but I can maybe talk through a few ways it might help speed up your development once you get it going. And if you host it on Github, maybe I can help.

bwburke94 · 04-03-2012, 09:52 PM

Bump. Are you still working on this?

Cryomaniac · 04-07-2012, 02:46 PM

Quote:

Originally Posted by bwburke94

Bump. Are you still working on this?

I was wondering this too.

Layton is my Homeboy · 04-08-2012, 10:56 PM

Quote:

Originally Posted by bwburke94

Bump. Are you still working on this?

Yes I am. I haven't been on the forums in a bit because Skyrim is ruining my life.

Right now, what I'm really struggling with is the very large amount of memory needed to handle all this processing, especially for larger leagues. I'm trying to figure out some clever ways to really cut down on memory usage. However, if I can't get it down to a manageable amount, I may need to re-think the approach from scratch.

One thing that would be really helpful to me is some ideas around the following problem, which should be easy, but is actually surprisingly challenging.

The problem is this: how do I figure out how many games each team should play against each opponent.

This actually IS an easy problem if I make a whole bunch of assumptions about the league structure. However, I want to be able to support as many different types of league set-ups as possible. In order to make the program more robust, a generalized approach for figuring out how many teams each team plays each opponent needs to be able account for the following:

Unbalanced schedules (in which case the program needs to determine how many more divisional games there should be vs league games)
Interleague play with asymmetrical subleagues (different number of divisions, or same number of divisions but different number of teams in each division)
Schedule length that is not evenly divisible by games per opponent (for example, playing X games against some divisional opponents, and Y games against others because they using all X won't add up to the proper number of total games).
Playing a different number of home and away games against some opponents.

All of these variables add complexity to the solution. The result is that I am really struggling to figure out a good generalized approach that can be used from a simple one-division league to a complex, asymmetric, interleaguey, not-evenly divisible setup.

I'd certainly appreciate any thoughts on this issue.

rjl518 · 04-09-2012, 08:47 AM

could always use a good schedule generator.

ootpFox07 · 04-10-2012, 02:30 PM

Just remembered this. Not sure if it helps (or hinders) your efforts, but I created a very simple schedule creator in the league_model for my fantasy baseball mod. Goes through and creates (x) number of games per team and assures that each team plays unique opponents. Search for the createLeagueSchedule function (line 1148).

Layton is my Homeboy · 04-10-2012, 03:25 PM

Thanks. I'll take a look at it when I get home tonight.

Sent from my iPhone using Tapatalk

Layton is my Homeboy · 04-16-2012, 09:36 PM

I'm starting to put together some of the code to implement some of the ideas referred to earlier in this thread. As indicated above, the major constraint that I am facing is memory usage.

As an example, consider the chunk of code below. At this stage in the process, I have an array called $pairings which store all of the possible unordered team vs team matchups, such that team1 < team2. The goal is to create a complete menu of every possible unordered block (that is, an array of pairings).

I've tried to limit the search space by bounding each iteration of the loop to the greatest extent possible. However, even with a relatively small dataset (16 teams who all play one another), the memory needed to store the resulting array makes this approach unfeasible.

PHP Code:


			
while ($stepCounter <= $steps) {
  /*
  For each step (ie slot in the block), branch the current solution to include
  all of the possibilities for the next slot. Pairings are ordered by the first team
  in each slot. As a result, the possibility space is bounded by only considering
  pairings in which the lowest unassigned team is the first team in the next
  pairing to be assigned.
  */
  $newBranches = array();
  // Loop through the existing solution space and branch each solution
  foreach($branches as $branchKey => $branchValue) {
    // Identify the lowest unused team in the current solution
    $lowestUnusedTeam = min($branchValue['unusedTeams']);
    // Bound the branching space based on the lowest unassigned team
    $lowerBound = $pairingsKeys[$lowestUnusedTeam]['lower'];
    $upperBound = $pairingsKeys[$lowestUnusedTeam]['upper'];
    $pairingsCounter = $lowerBound;
    // Loop through the possible new branches
    while ($pairingsCounter <= $upperBound) {
      $unusedTeams = $branchValue['unusedTeams'];
      // Check whether team2 has already been assigned in this solution
      if (in_array($pairings[$pairingsCounter]['team2'], $unusedTeams)) {
        // This branch works - add it to the solution space
        unset($unusedTeams[$pairings[$pairingsCounter]['team1']]);
        unset($unusedTeams[$pairings[$pairingsCounter]['team2']]);
        $newBranch = array(
          'slots' => array_merge($branchValue['slots'], array($pairingsCounter)),
          'unusedTeams' => $unusedTeams,
          );
        //print_r($newBranch); //debug
        // Add this branch to the array of new branches
        $newBranches[] = $newBranch;
        unset($newBranch);
      }
      // Iterate the pairings counter
      $pairingsCounter++;
    }
  }
  // Replace the existing search space with the newly branched solutions
  $branches = $newBranches;
  unset($newBranches);
  // Iterate to the next step (ie slot of the block) and repeat the branching process
  $stepCounter++;
}

So, it's clear that creating a complete menu of every possible block is not going to be a workable solution. This leaves a few options:

Create an incomplete menu that nevertheless guarantees that we have enough options on the menu to be able to construct a workable schedule.
Instead of creating one large menu, create several mini-menus that can be combined together (that is, create some of the basic components that could be used to create a complete menu).
Scrap this approach altogether, and develop an approach that doesn't require creating a menu of blocks.

Option #1 is attractive to me, because it preserves the essence of the approach that I want to use. However, I can't think of a good way to do it, other than by using option #2.

Option #2 is possible. What I have in mind here is to create several "mini-menus" that can then be assembled together to form complete blocks. For example, I could create a mini-menu for every division that includes only divisional games, and then do the same for every required interdivisional and interleague. These menus could then be mixed and matched to create workable blocks.

There are a couple of downsides to that approach, though. One is that it has to take account of some interesting variables. For example, some of the mini-menus will be using an odd number of teams, meaning that there is an odd team out. When those mini-menus are combined, the odd team out must be assigned to play the odd team out from a different mini-menu. So, there is an additional layer of complexity when combining the menus to form a block.

Another downside is that Option #2 won't address the memory issue in some use cases. For example, if I want to create a 1-division league, there is no natural way to create mini-menus. I would simply be using the same approach as in the code above. If that 1-division league had at least 8 teams, the memory usage would be unworkably high.

Option #3 solve the memory problem. In fact, most scheduling algorithms do not follow the kind of overall approach that I have been developing in this thread.

In my approach, I start with the big (a complete menu of every feasible block) and work my way to the specific (selecting particular blocks which form a stack). The memory issues occur at the early stage, when every possibility has to be considered.

Most conventional approaches do the opposite. They start at the smallest level (particular team vs team pairings) and gradually build them up into blocks and then stacks. In this approach, the vast majority of the possibility space is never even considered, which means that memory usage is much lower in general.

The downside to reversing the approach is that it makes it much harder to apply some (but not all) of the optimization techniques that I want to use. You can still do genetic ordering of blocks into an ordered stack, but it's harder to construct optimized blocks in the first place.

So, for the time being, that is the problem that I am wrestling with. If anyone has any thoughts on how best to proceed from here, I would be very appreciative.

My inclination is that I need to re-think the overall approach. However, if I do so, I want to do it in a way that stays true to the quest of optimizing the schedule as a whole.

shawa666 · 10-08-2012, 06:21 PM

Any news on this?

Layton is my Homeboy · 12-09-2012, 11:54 AM

Quote:

Originally Posted by shawa666

Any news on this?

Not yet, sorry.

Marinersfan51 · 11-12-2013, 07:36 PM

Is this still being worked on or did it die?

03-07-2012, 02:11 PM	#4
Layton is my Homeboy Major Leagues Join Date: Apr 2006 Location: Winnipeg, Mb Posts: 429	A Basic Approach So, with some of that background out of the way, it's time to start thinking about the basic approach to building a schedule algorithmically. There are a number of different approaches that could be used. Personally, my thinking is heavily informed by GMO's Making Baseball Schedules blog. What I have come up with is an adaptation of the basic approach described on that blog. What I like about GMO's approach is that it breaks down a complex problem into a series of simpler problems. The solutions to those simpler problems become building blocks that allow you to solve more complex problems. Over the past several weeks of trial-and-error programming, the major steps that I have identified are as follows: Step 0: Gather schedule parameter's based on the user's input. Step 1: Based on the schedule parameters, figure out how many games each team should play against each opponent (that is, create the games matrix and the series matrix). Step 2: Create a menu of every feasible block using the pairings defined in the games/series matrix. In other words, find every feasible arrangement of games such that every team is playing. In doing so, assign a desirability score to each block in the menu. Step 3: Use the menu of feasible blocks to construct a feasible unordered stack (that is, an arrangement of blocks such that the use up all and only the series in the series matrix) that optimizes the selection of desirable blocks. Step 4: Assign home and away values and series lengths to the pairings in the blocks in the stack, such that each team plays an appropriate number of home and away games against each opponent. Step 5: Arrange the blocks in the stack in a way that optimizes desirable traits. Step 6: Apply finishing touches such as game times, series offsets, etc. In subsequent posts, I will describe the basic approaches that I am using at each step in this process (not necessarily in order) with a view to soliciting some advice and feedback to see if the approaches that I am using can be improved upon. Last edited by Layton is my Homeboy; 03-07-2012 at 09:54 PM. Reason: Formatting

03-08-2012, 09:12 PM	#7
Layton is my Homeboy Major Leagues Join Date: Apr 2006 Location: Winnipeg, Mb Posts: 429	Step 5: Optimize the Block Order I am going to skip step 4, as that step is very easy to do. In that step, we assign home and away values and series lengths to the pairings, transforming them into proper series. STEP 5: Optimize Block Order At this stage, we have an unordered stack full of blocks. Each block consists of series. Each series consists of a home team, an away team, and a series length. We now have all the information we need to sort the blocks into an order that results in a desirable schedule. The goal at this stage in the process is to determine an optimal or near-optimal arrangement of blocks. An optimal arrangement is one that maximizes desirable schedule features and minimizes undesirable schedule features. Some of these features are described earlier in this thread. However, I will repeat some of them here because they will inform how we approach this task. Desirable Features: Divisional games late in the season. Divisional games on opening day. Teams play roughly the same number of weekend home series. Series ending on a Sunday. Undesirable Features: Repeaters and semi-repeaters (that is, back-to-back series between the same two teams). Excessively long roadtrips. Interleague games late in the season. Too much travel (defined for the purposes of this exercise as playing consecutive away series against teams in different divisions). Too much time between days off. In effect, we are going to compare possible arrangements of blocks by assigning each arrangement a score. Arrangements will get a positive score if they feature desirable traits and a negative score if they feature undesirable traits. To find an optimal arrangement, we could just try out every possible arrangement, figure out how desirable it is, and then use the most desirable one (ie a brute force alogorithm). Unfortunately, this is not a viable solution as it will take up far too much memory and processing time. For example, let's suppose that we have 52 blocks (that is what a modern MLB-type schedule uses - a 26-week schedule times 2 block per week). There are 52! (that is 52 x 51 x 50 x 49 etc.) different orders in which we can arrange those blocks, or about 8.06581752 × 10 ^ 67. Obviously, this is not going to work. So, we have two basic options. The first option is to use an alogrithm that limits the search space while still guaranteeing us the optimal block order (for example, the branch-and-bound technique that we used in Step 3). The second option is to use an algorithm that drastically limits the search space, but only gives us a near-optimal result. In this case, I am inclined to use a technique that drastically cuts down the spearch space, at the expense of not being guaranteed the optimal solution. This particular problem is not well-suited to a branch-and-bound algorithm, because it is difficult to come up with an effective bounding function. A branch-and-bound solution without a good bounding function won't be very effecient, because it will fail to "prune" bad candidates and therefore it won't limit the search space effectively. My inclination in this situation is to use a technique that will be much more efficient, while still giving us a near-optimal solution. In particular, I am going to use a genetic alorithm for this stage of the process. The basic idea behind genetic algorithms is to mimic the process of biological evolution. So, we will start with a random population of candidate solutions. We will assign scores to each solution, based on the desirability criteria that we defined earlier. Then, we will drop the least "fit" (ie desirable) solutions from the field. The remaining solutions will become the "parents" for the next generation of solutions. For each parent, we will create a set of "child" solutions, each of which is very similar to its parent, but with a slight variance. Then, we will repeat the process of discaring the least fit solutions, and move on to the next generation. We can repeat this process for a fixed number of generations, or a fixed amount of time, or until we reach a minimum threshold desirability. Whatever the case in that regard, we will choose the fittest solution that remains after the last generation and use that arrangement to construct our schedule. The tricky part with this approach is figuring out a good way of varying our candidate solutions as they pass on to the next generation. One idea that occurs to me is that each block within the arrangement can have its own score. We can do a number of variations on a parent solution by swapping the lowest-scoring block in a parent solution with various blocks from elsewhere in the stack. We can try swapping places between the lowest and the second lowest scoring blocks (on the hope that it will improve both of them). We can also make a few child solutions that just randomly swap any two blocks within the stack (just for the sake of having more variation). The parent solutions will themselves be passed on directly as child solutions as well (in case none of the variations improves on the parent). Finally, we can introduce a small pool of entirely new arrangements at each generation (again, just for the sake of more diversity). Using this approach, we will be able to zero in on desirable arrangements fairly quickly, even though we are only looking at a very small portion of the search space. Of course, the more generations that we run, the better our final solution will be. In effect, we will be looking for an acceptable trade-off between speed and optimization. Last edited by Layton is my Homeboy; 03-14-2012 at 07:46 PM. Reason: Fixed wording

03-13-2012, 10:41 PM	#8
ootpFox07 All Star Starter Join Date: Dec 2005 Posts: 1,674	Layton, Awesome idea. This sounds like it's going to be the de-facto schedule generator. Not sure how you are planning to implement this, but I would love to be able to talk about this possibly being part of the OOLM. I'm trying to provide everyone with a development starting point, so let me know once you get to that point and see if it would make any sense. __________________ My OOTP Gaming Channels: GamerFox015 YouTube Channel Twitch.tv channel for live sims My OOTP Mods: OOTP Fantasy Leagues - Version 1.1 Build and run Fantasy leagues from your online league! Available for OOTP 21. New Web site

04-09-2012, 08:47 AM	#14
rjl518 Hall Of Famer Join Date: Mar 2007 Location: Born in Shea Stadium, lives in LoanDepot Park. Posts: 6,240	could always use a good schedule generator. __________________ My Threads: MLB Project 32 by SFGiants58 "Colon looking for his 1st hit of the year and he DRIVES ONE! Deep left field! Back goes Upton! Back near the wall! ITS OUTTA HERE!!! Bartolo has done it!!! THE IMPOSSIBLE HAS HAPPENED!!! This is one of the great moments in the history of baseball! Bartolo Colon has gone deep!" ---Gary Cohen. (May 7, 2016) (Petco Park) NYM 6 @ SD 3

04-10-2012, 02:30 PM	#15
ootpFox07 All Star Starter Join Date: Dec 2005 Posts: 1,674	Just remembered this. Not sure if it helps (or hinders) your efforts, but I created a very simple schedule creator in the league_model for my fantasy baseball mod. Goes through and creates (x) number of games per team and assures that each team plays unique opponents. Search for the createLeagueSchedule function (line 1148). __________________ My OOTP Gaming Channels: GamerFox015 YouTube Channel Twitch.tv channel for live sims My OOTP Mods: OOTP Fantasy Leagues - Version 1.1 Build and run Fantasy leagues from your online league! Available for OOTP 21. New Web site

03-06-2012, 09:18 PM	#1
Layton is my Homeboy Major Leagues Join Date: Apr 2006 Location: Winnipeg, Mb Posts: 429	Creating a Web-Based Schedule Generator I am in the process of developing a web-based schedule generator for OOTP. This project is not yet complete. In fact, it will likely take me some time to finish it. Creating good baseball schedules for an MLB-type of league is a notoriously difficult task. Several excellent papers have been written (some available online, others not) which describe various algorithmic solutions for creating desirable MLB schedules. However, despite a large amount of academic and professional attention being paid to this problem it remains difficult to solve in a way that is both efficient (in terms of processing) and optimal (in terms of maximizing desirable features in the resulting schedule). The task of creating a schedule generator for OOTP is even harder. OOTP supports a wide variety of league formats (division structure, number of teams, etc.), schedule lengths (number of games), schedule formats (interleague/no interleague), game distributions (balanced/unbalanced), etc. As a result, a generalized schedule generator for OOTP needs to be flexible enough to handle a large number of use cases. Algorithms that work well for a modern MLB setup need to be generalized to work for very different types of leagues and schedules. The purpose of this thread is to gather some advice and feedback from the OOTP community that will help me to work through some of the programming and design challenges that I am facing in my effort to develop a schedule generator. I appreciate any help that anyone is willing to offer, but I am particularly interested in getting some input from the following categories of people: Those who have experience making OOTP schedules by hand. Those who are knowledgeable about the paramaters and constraints that are used to create actual schedules for the MLB and other leagues (both now and historically). Those with computer programing skills who can offer some advice in terms of the algorithms used to implement the solution design. In subsequent posts, I will get much more specific about the type of information and assistance that I need. However, before I come to that, there is some background information that will be useful to this discussion (which will be posted shortly). For now, I strongly encourage anyone who is interested in this thread to read GMO's Making Baseball Schedules blog, in its entirety (there are only 6 posts on the blog, and the whole thing can be read in an hour). It is an extremely well-written description of some of the steps that GMO takes when creating a schedule for OOTP. It will serve as useful background reading for the purposes of this thread.

03-06-2012, 09:57 PM	#2
Layton is my Homeboy Major Leagues Join Date: Apr 2006 Location: Winnipeg, Mb Posts: 429	What Makes a Good Schedule? What makes a good schedule? This is the first question that I want to ask the community. OOTP will generate perfectly fine schedules all on its own. So, why bother going to all the effort of creating a schedule generator outside of the game? The basic reason is that the schedules OOTP generates are not always completely satisfactory. So, part of the goal of our schedule generator will be to emphasize traits that make a schedule desirable and to de-emphasize traits that a make a schedule undesirable. So, what makes a good schedule? I think that question has to be answered using two lists. The first is a list of absolute requirements. A schedule that does not meet these requirements cannot be used. The second is a list of desirable traits. A good schedule is one that conforms as much as possible with these desirable traits. As part of our solution design, we will be implementing combinatorial optimization algorithms that will ensure that we are likely to get a schedule that is both good and feasible. I am interested to get some thoughts from the many great OOTP schedule-makers as to what should be included on these two lists. In addition, I would like to get a sense of which items in the "desirable" list are the most important. When we apply our optimization functions, we will assign "scores" to possible schedules based on the degree to which they conform the desirable traits. We can therefore assign weights to each of them relative to their importance. To get the ball rolling, here are a few items that I consider important. Requirements (Must-Haves) Teams play the same number of games as each other. Each team plays the same number of home and away games (with the consequence that each team must play an even number of games throughout the season). Each team plays all and only the opponents they are supposed to play (as defined by the league structure and schedule parameters). Each team plays an appropriate number of games against each opponent (according to the league structure and schedule parameters). Desirable Traits (Good-to-Haves) If possible, each teams plays an equal number of home and away games against each opponent. Series are arranged so that they regularly end on a Sunday. Teams play roughly the same number of home games on weekends. No team plays too many consecutive series away from home (ideally, three series away from home is the maximum). Teams do not play consecutive series against the same opponent (ie semi-repeaters). Travel time is minimized.* Game times are approrpriate to the era and games scheduled on the same day do not all start at the same time. Teams do not all take an off-day on the same day (ie there is at least one game for every day in the schedule, other than during an all-star break).* Teams do not have too many days in between days off. The end of the season features more divisional games. Opening day features mostly divisional games. Interleague games are avoided late in the season.** Comments** A few of these items are marked with asterisks. These items merit some further comment. In a real-world scheduling application, we would attempt to minimize travel time by figuring out the actual, physical distance between each of the teams in the league. When creating a schedule for OOTP, it isn't possible to get to that level of precision. When creating a schedule in OOTP's XML schedule format, each team is represented by a number. The specific team that is assigned to a particular number rotates year by year. So, when making the schedule, you don't necessarily know which specific team is "team 1" will be. You only know which division team 1 is in. So, for the purposes of minimizing travel time in OOTP, the best we can do is to try to avoid playing consecutive away series against opponents from different divisions (on the assumption that the divisions have geographic significance). As LGO pointed out here, there is a lack of historical data regarding specific game start times. Retrosheet has some data available that gives a simple breakdown of day vs. night games year-by-year, but the data are not more refined than that. *This is mainly an aesthetic preference, but I think that it is probably the biggest complaint that most people have about the schedules that OOTP generates. An OOTP-generated schedule typically has several days throughout the year where every* team takes an off day. This is not realistic and I personally find it unsatisfying. I prefer to see 1/2 games back in the standings, and teams with games in hand against their division rivals. I think that it makes pennant races more interesting and dynamic. It is also more true to life. ****The current MLB schedule format puts most interleague games in the first half of the season. When the Astros move to the AL West, this will change, as there will have to be at least one interleague series at all times during the season. However, unless we are required to do this (as a result of having two subleagues that both have an odd number of teams), we will avoid late-season interleague play as much as possible.

03-06-2012, 11:33 PM	#3
Layton is my Homeboy Major Leagues Join Date: Apr 2006 Location: Winnipeg, Mb Posts: 429	Some Terminology As I step through my thoughts on a solution to the scheduling problem, it will be helpful to define some terminology. This way, we will all be on the same page when discussing certain topics. Schedule parameters are the options and settings defined by the user that dictate a number of requirements for the schedule. Parameters include the number of teams in the league, the league structure, the number of games played by each team, whether the schedule includes interleague play, whether the schedule is balanced or unbalanced, the typical series length, etc. A team means a team in the league. When we are creating schedules, each team is represented by an integer. A set is ordered if the arrangement of the members of the set in relation to one another has some significance. A set is unordered if the arrangement of the members of set does not have any significance. A pairing is a unordered pair of teams. For example, the pairing of teams 1 and 2 is represented by the set {1,2}. Because a pairing is unordered, the pairing {1,2} is equivalent to the pairing {2,1}. In effect, a pairing represents a series, before we determine which team is the home team and which is the away team, or the number of games in the series. A pairing is feasible if the teams in the pairing are supposed to play each other during the season. A series is a pairing of teams, where one of the teams is designated as the "home" team and the other is designated as the "away" team. A series also has an additional value: the number of games in the series. Series are the basic building blocks a schedule. A series (or a pairing) can be a divisional series, a league series, or an interleague series. A divisional series is one in which both teams are in the same division. A league series is one in which both teams are in the same subleague, but they are not in the same division. An interleague series is one in which the two teams are not in the same subleague. The games matrix is a notional table that represents the number of games that each team is supposed to play against each other team during the season (which is determined from the league parameters). Similarly, the series matrix is a notional table that represents the number of series that each team is supposed to play against each other team during the season. A block is a set of pairings, such that every team in the league appears exactly once in one of the pairings in the block. For example, if there are four teams in the league, then one possible block would be {{1,2},{3,4}}. A block is unfinished if it consists of pairings. A block is finished if it consists of series (that is, pairings where the home and away values and a series length has been assigned). A feasible block is one in which all of the pairings are feasible. There is one special block, which is the all-star block. If the league parameters call for an all-star break and game to be scheduled, the special all-star block will be inserted into the schedule in an appropriate place. A stack (building on the block metaphor) is a set of blocks. A feasible stack is one in which the set of all of the pairings/series in all of the blocks matches up with the the series matrix. In other words, a feasible stack basically represents a set of blocks that will result in a schedule that conforms the to league parameters. An unordered stack is one in which the order of the blocks has not yet been determined. An ordered stack is one in which the blocks have been arranged into a particular order (generally, after an optimization process). Last edited by Layton is my Homeboy; 03-07-2012 at 10:08 PM. Reason: Added additional definitions

03-14-2012, 06:36 PM	#9
Layton is my Homeboy Major Leagues Join Date: Apr 2006 Location: Winnipeg, Mb Posts: 429	Thank you. I am planning to open source the code, so you would be more than welcome to incorporate it. I will be implementing it in PHP, so it should play nice with other web apps. That said, I am expecting that development on this could take a while. Sent from my iPhone using Tapatalk

04-03-2012, 09:52 PM	#11
bwburke94 Hall Of Famer Join Date: Jun 2008 Location: Belchertown, MA, USA Posts: 4,444	Bump. Are you still working on this?

04-10-2012, 03:25 PM	#16
Layton is my Homeboy Major Leagues Join Date: Apr 2006 Location: Winnipeg, Mb Posts: 429	Thanks. I'll take a look at it when I get home tonight. Sent from my iPhone using Tapatalk

04-16-2012, 09:36 PM	#17
Layton is my Homeboy Major Leagues Join Date: Apr 2006 Location: Winnipeg, Mb Posts: 429	Memory, memory, memory I'm starting to put together some of the code to implement some of the ideas referred to earlier in this thread. As indicated above, the major constraint that I am facing is memory usage. As an example, consider the chunk of code below. At this stage in the process, I have an array called $pairings which store all of the possible unordered team vs team matchups, such that team1 < team2. The goal is to create a complete menu of every possible unordered block (that is, an array of pairings). I've tried to limit the search space by bounding each iteration of the loop to the greatest extent possible. However, even with a relatively small dataset (16 teams who all play one another), the memory needed to store the resulting array makes this approach unfeasible. PHP Code: while ($stepCounter <= $steps) { /* For each step (ie slot in the block), branch the current solution to include all of the possibilities for the next slot. Pairings are ordered by the first team in each slot. As a result, the possibility space is bounded by only considering pairings in which the lowest unassigned team is the first team in the next pairing to be assigned. / $newBranches = array(); // Loop through the existing solution space and branch each solution foreach($branches as $branchKey => $branchValue) { // Identify the lowest unused team in the current solution $lowestUnusedTeam = min($branchValue['unusedTeams']); // Bound the branching space based on the lowest unassigned team $lowerBound = $pairingsKeys[$lowestUnusedTeam]['lower']; $upperBound = $pairingsKeys[$lowestUnusedTeam]['upper']; $pairingsCounter = $lowerBound; // Loop through the possible new branches while ($pairingsCounter <= $upperBound) { $unusedTeams = $branchValue['unusedTeams']; // Check whether team2 has already been assigned in this solution if (in_array($pairings[$pairingsCounter]['team2'], $unusedTeams)) { // This branch works - add it to the solution space unset($unusedTeams[$pairings[$pairingsCounter]['team1']]); unset($unusedTeams[$pairings[$pairingsCounter]['team2']]); $newBranch = array( 'slots' => array_merge($branchValue['slots'], array($pairingsCounter)), 'unusedTeams' => $unusedTeams, ); //print_r($newBranch); //debug // Add this branch to the array of new branches $newBranches[] = $newBranch; unset($newBranch); } // Iterate the pairings counter $pairingsCounter++; } } // Replace the existing search space with the newly branched solutions $branches = $newBranches; unset($newBranches); // Iterate to the next step (ie slot of the block) and repeat the branching process $stepCounter++; } So, it's clear that creating a complete* menu of every possible block is not going to be a workable solution. This leaves a few options: Create an incomplete menu that nevertheless guarantees that we have enough options on the menu to be able to construct a workable schedule. Instead of creating one large menu, create several mini-menus that can be combined together (that is, create some of the basic components that could be used to create a complete menu). Scrap this approach altogether, and develop an approach that doesn't require creating a menu of blocks. Option #1 is attractive to me, because it preserves the essence of the approach that I want to use. However, I can't think of a good way to do it, other than by using option #2. Option #2 is possible. What I have in mind here is to create several "mini-menus" that can then be assembled together to form complete blocks. For example, I could create a mini-menu for every division that includes only divisional games, and then do the same for every required interdivisional and interleague. These menus could then be mixed and matched to create workable blocks. There are a couple of downsides to that approach, though. One is that it has to take account of some interesting variables. For example, some of the mini-menus will be using an odd number of teams, meaning that there is an odd team out. When those mini-menus are combined, the odd team out must be assigned to play the odd team out from a different mini-menu. So, there is an additional layer of complexity when combining the menus to form a block. Another downside is that Option #2 won't address the memory issue in some use cases. For example, if I want to create a 1-division league, there is no natural way to create mini-menus. I would simply be using the same approach as in the code above. If that 1-division league had at least 8 teams, the memory usage would be unworkably high. Option #3 solve the memory problem. In fact, most scheduling algorithms do not follow the kind of overall approach that I have been developing in this thread. In my approach, I start with the big (a complete menu of every feasible block) and work my way to the specific (selecting particular blocks which form a stack). The memory issues occur at the early stage, when every possibility has to be considered. Most conventional approaches do the opposite. They start at the smallest level (particular team vs team pairings) and gradually build them up into blocks and then stacks. In this approach, the vast majority of the possibility space is never even considered, which means that memory usage is much lower in general. The downside to reversing the approach is that it makes it much harder to apply some (but not all) of the optimization techniques that I want to use. You can still do genetic ordering of blocks into an ordered stack, but it's harder to construct optimized blocks in the first place. So, for the time being, that is the problem that I am wrestling with. If anyone has any thoughts on how best to proceed from here, I would be very appreciative. My inclination is that I need to re-think the overall approach. However, if I do so, I want to do it in a way that stays true to the quest of optimizing the schedule as a whole.

10-08-2012, 06:21 PM	#18
shawa666 All Star Reserve Join Date: Apr 2010 Posts: 601 Infractions: 0/1 (1)	Any news on this?

11-12-2013, 07:36 PM	#20
Marinersfan51 All Star Reserve Join Date: Dec 2009 Posts: 774	Is this still being worked on or did it die?