Grow a Decision Tree to Support Decision-Making, Machine Learning
ISE Magazine August 2019 Volume: 51 Number: 8
By Alaa Kafafi
A decision tree, a special form of tree diagram, has been a popular tool among managers and professionals for a long time. They use it to make a correct decision or to find a solution to an issue that arises repeatedly. Because the tool is user-friendly, its use has extended to the area of machine learning, where it is known as decision tree analysis, and that is the reason for revisiting it now.
Uses for a decision tree include supporting the decision-making process, finding a solution to a repeatable problem, training computers from the data and developing a predictive model, along with encoding the work rules that can be applied automatically by computers.
We will address the two main uses of the decision tree through two examples. The first is about a delay in delivering engineering projects, which was experienced by a global engineering company. The second is about Chase Bank’s mortgage risk, which has been excerpted from Eric Siegel’s book “Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.” In addition, we will briefly discuss machine learning and its intended use in predicting an individual’s future behavior from past data (predictive analytics; see related article at right).
As we will see, creating the decision tree is simple. We develop a statement of the goal or problem in question, then ask sequential questions that lead to the next level of detail until we come to the correct decision or solution at the end point. Our examples will show how the tree is drawn. Finally, we will test the tree for its ability to lead to the correct decision in different scenarios.
Supporting the decision-making process
Let’s see a hands-on example of using a decision tree to support the process of decision-making. Then, we will see that the principles in this example are also valid for the process of machine learning.
A delay in delivering a project usually has consequences for both the client and the contractors. In the oil and gas industry, for instance, the client is usually already obligated to supply gas, oil or other products to its customers just after completion of the project. Accordingly, failure to complete the project on schedule will leave the client in bad shape. Another example is a delay in time-to-market for a newly developed product, which will put the manufacturer behind its competitors.
We had a major engineering project that was suffering a delay in schedule; the root cause was recognized as turnover among the project’s team members. The turnover had been so severe that the project team was changed completely more than once. The idea behind using the decision tree was to construct a tool to support the project manager in taking the proper action when he or she noticed a pattern that could cause a delay in project delivery. Accordingly, to construct the decision tree, a team of managers and experienced engineers went through many brainstorming sessions to find all the reasons that had led to a delay in the schedule of previous projects.
The problem at the root of the tree is a delay in project delivery, and it has four main causes: the project is understaffed, the project scope changes frequently, quality problems lead to rework, and contingencies arise.
In the first main cause, employee turnover and other factors led the project to be understaffed. It was obvious that policies for retaining and motivating employees should be upgraded. Besides the benefits offered to employees, we decided to hold a team-building meeting every month. This tea party was held to recognize and reward good performance.
In addition, we agreed on conducting an employee satisfaction survey semi-annually to better understand the employees’ perception of value. Finally, a succession plan for each project was developed. In case a key team member left, a known replacement would take over smoothly.
Fluctuations in the need for resources were another contributor to the issue. In order to level the resources during the execution of the project, we recommended recruiting contractors during the peak need times, along with engineers who were between projects. The latter were needed to level the resources across the entire company by starting different projects on a staggered schedule. Overloaded resources were another factor shown in the same branch of the tree: when an engineer was working on more than one project, his or her productivity was lower than that of an engineer devoted to a single project. Since coordination with each project team consumed a big portion of the working time, we came to a consensus to limit the contribution of a single engineer to only two projects.
The frequent change of the project’s scope (the second branch of the tree) was another main cause of falling off schedule. We found that this cause was largely attributable to miscommunication with the client. The remedy was to develop a documented communication plan for each phase and to ensure that, by the time we completed the front-end engineering design stage (a higher-level design phase not affected much by changes) and commenced the detailed engineering phase, we had already agreed with the client on about 90% of the project’s scope; the remaining 10% was considered an allowance for contingencies. In addition, we recommended allocating spare resources to meet minor changes in the project’s scope.
The third branch of the tree dealt with quality problems that led to rework and drove the project off schedule.
Where the non-conformances were against standards, codes and regulations, a technical review was the tool to find those design errors and correct them. If the project team was technically competent but weak in process, then a quality audit was the tool to disclose noncompliance with internal procedures (quality and operating procedures) and complete corrective actions. We recommended increasing the frequency of reviews or audits according to the apparent non-conformances.
Also, project managers were advised to hold “lunch ’n learn” sessions to train on weak technical areas and to ensure that all the project team members went through training on internal procedures. Training on quality procedures was usually a part of the onboarding training for engineers, while training on operational procedures was held just after forming the project team.
Our quality audits of the projects revealed a major cause for not complying with the operating procedures: newcomers, who were usually from similar engineering companies, came to us with their experience of the previous employers’ culture. They found it easier to do the work the way they used to rather than considering the culture of the new employer. Indeed, the culture of the business is the reason behind its existence and continuation; we can’t impose the culture of another firm on our business even if this firm is a leader in the industry. Accordingly, project managers were advised to monitor newcomers’ way of executing the work and to educate them about the company’s culture.
Contingencies, the fourth main cause of the issue, were events not considered during the planning phase. One way to place countermeasures against them was to refer to the lessons learned from similar projects to identify those events first. We usually went through lessons learned at the planning phase and placed countermeasures against the documented contingencies from previous projects. Then, when closing the project, we started discussing and documenting the new lessons.
We found that holding a quarterly meeting to discuss unplanned events, rather than waiting until completion of the project, helped decrease the effects of contingencies. During these meetings, management and representatives from other projects joined the project team to discuss what went well, what did not go as planned and the appropriate measures to keep the project on schedule. These meetings were a good opportunity to share knowledge and spread good practices among all the company’s projects. But the main advantage of increasing the frequency of these meetings was that we were better prepared to cope with unpredictable events (black swans) when they occurred. Figure 1 shows the outcome of the above analysis with 14 decisions in red.
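Since Figure 1 is not reproduced in this text, the following is a minimal sketch of how part of the analysis above could be written down as a tree of questions. It is an illustration only: the question wording, the `TREE` layout and the `decide` helper are assumptions made for this sketch rather than the company's actual tool, and only some of the 14 decisions are included.

```python
# Hypothetical, simplified encoding of the project-delay tree described above.
# Inner nodes are dicts: {"question": str, "answers": {answer: subtree}}.
# Leaves are lists of recommended decisions paraphrased from the text.
TREE = {
    "question": "What is the main cause of the delay?",
    "answers": {
        "understaffed": {
            "question": "Why is the project understaffed?",
            "answers": {
                "turnover": [
                    "Upgrade retention and motivation policies",
                    "Hold a monthly team-building meeting",
                    "Run a semi-annual employee satisfaction survey",
                    "Develop a succession plan for the project",
                ],
                "fluctuating demand": [
                    "Recruit contractors during peak periods",
                    "Use engineers who are between projects",
                ],
                "overloaded engineers": [
                    "Limit each engineer to two projects",
                ],
            },
        },
        "scope changes": [
            "Develop a documented communication plan for each phase",
            "Allocate spare resources for minor scope changes",
        ],
        "quality problems": {
            "question": "What kind of non-conformance is it?",
            "answers": {
                "standards, codes or regulations": ["Conduct a technical review"],
                "internal procedures": ["Conduct a quality audit"],
            },
        },
        "contingencies": [
            "Review lessons learned from similar projects",
            "Hold a quarterly meeting on unplanned events",
        ],
    },
}


def decide(tree, answers):
    """Walk the tree using the given answers, in order, and return the decisions."""
    node = tree
    remaining = iter(answers)
    while isinstance(node, dict):
        answer = next(remaining)        # raises StopIteration if answers run out
        node = node["answers"][answer]  # raises KeyError for an unknown answer
    return node


# Testing the tree against two scenarios from the text.
print(decide(TREE, ["understaffed", "turnover"]))
print(decide(TREE, ["quality problems", "internal procedures"]))
```

Each call answers the questions from the root downward, which mirrors how a project manager would read the diagram: one answer per level until a set of decisions is reached.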
As we saw from this example of a decision tree, when we go deep in analysis by asking questions at each level of detail and move from one level to another, the decisions taken may resolve problems other than the one in question. For example, the succession plans developed were intended to solve the issue of employee turnover, but they also resolved the cases of sickness and maternity leaves, which were considered contingencies. Also, the policies for retaining and motivating project team members could benefit the employees of the supporting functions.
This also illustrates that testing of the tree was successful. When we developed the tree, the issue was a delay in schedule and the cause was known (engineer turnover); we then added many causes based on past experience. In each scenario, the tree worked well and led the user to the appropriate decision.
Finally, we might miss one or more variables that affected the issue under analysis, either because the variable was insignificant or because it was not repeatable, as the considered variables were. However, as long as the tree led the user to the correct decision in every scenario, testing of the tree was successful and the tree will do well in general.
We could not calculate the monetary value of this exercise. However, customer satisfaction surveys showed that customers became more satisfied when seeing their projects on schedule. In the year after introducing this decision tree, a potential client agreed to enter into a long-term alliance with us through a service agreement. We then enjoyed working with this customer without bidding, which was an indication of the high loyalty of our clients.
When we performed this exercise, the team searched only one database set, lessons learned, for one geographical area, North America, and for the last three years. Suppose we extended the search to all project database sets of all geographical areas for the last 30 years in order to not miss any variable that may affect the issue in question? Then the exercise would be beyond human capabilities, but this kind of analysis is common in the big data era in which we live. Here comes the role of machines, as we will see from the following example.
Since we will address the use of the decision tree in the area of machine learning, it’s worth explaining this term briefly. Machine learning’s task is to find patterns that appear in the data, so that what is learned will hold true in situations never yet encountered. Accordingly, machine learning processes training data to produce a predictive model. Then this model takes the characteristics of the individual as input and provides a predictive score as output. The higher the score, the more likely it is the individual will exhibit the predicted behavior.
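The step from training data to predictive model to predictive score can be sketched with the smallest possible tree, a single learned question (often called a decision stump). Everything in this sketch is an assumption made for illustration: the toy data is invented, the names `train_stump` and `score` are hypothetical, and the split is chosen by simply counting misclassified rows, whereas real tools typically use other criteria such as Gini impurity or information gain.

```python
# Toy training data, invented purely for illustration:
# (number of past purchases, bought again? 1 = yes, 0 = no)
TRAINING = [(0, 0), (1, 0), (1, 0), (2, 1), (3, 0), (4, 1), (5, 1), (6, 1)]


def rate(rows):
    """Share of rows in which the behavior occurred (0.0 for an empty list)."""
    return sum(label for _, label in rows) / len(rows) if rows else 0.0


def errors(rows):
    """Rows misclassified if the leaf predicts its majority outcome."""
    positives = sum(label for _, label in rows)
    return min(positives, len(rows) - positives)


def train_stump(rows):
    """Learn a one-question tree: 'is the value >= threshold?'."""
    best = None
    for threshold in sorted({value for value, _ in rows}):
        low = [r for r in rows if r[0] < threshold]
        high = [r for r in rows if r[0] >= threshold]
        total = errors(low) + errors(high)
        if best is None or total < best[0]:
            best = (total, threshold, rate(low), rate(high))
    _, threshold, low_score, high_score = best
    return {"threshold": threshold, "low": low_score, "high": high_score}


def score(model, value):
    """Predictive score: historical rate of the behavior in the matching leaf."""
    return model["high"] if value >= model["threshold"] else model["low"]


model = train_stump(TRAINING)
print(model["threshold"], score(model, 1), score(model, 5))  # prints: 2 0.0 0.8
```

On this toy data the learned question is "two or more past purchases?", and an individual with five past purchases receives a higher score than one with a single purchase, in line with the idea that a higher score means the behavior is more likely.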
In short, the predictive model is a mechanism that predicts the behavior of an individual, such as buying a product, clicking an ad or prepaying a mortgage, as the following example will show.
We can now define the overarching technology within which machine learning works: predictive analytics (PA), technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions. The alternative risk-oriented definition of PA is technology that learns from experience (data) to manage micro risk. This illustrates the capabilities and limitations of PA technology.
It is usually what individuals have done that predicts what they will do, and people who have done something a lot are more likely to do it again. So PA combines demographic data about the individual, such as gender, education, location and age, with behavioral predictors such as frequency, purchases, financial activity and product usage (for example, calls and web surfing). These behavioral predictors are often the most valuable, since it’s always a behavior that we seek to predict.
On the other hand, PA applications cannot be used to manage macro risks; for that reason, we could not foresee or prevent the global financial crisis of 2008 or other “black swan” events.
While the terms discussed are largely self-explanatory, the following example will explain these concepts further.
Management tools and analytics have been in use for a long time, but they have come to light again in this era of technology revolution and have become tools of differentiation for businesses.
In addition to the classical uses of the decision tree – supporting decision-making processes and finding a solution to a repeated problem – the new uses include training computers from data and developing a predictive model that predicts individuals’ future behavior. A further use is encoding work rules that are applied automatically by the machine, since every path of the developed tree is a work rule within the context of the issue analyzed.
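The idea that every path of a tree is a work rule can be sketched by listing each root-to-leaf path as an IF-THEN statement. The miniature tree below is a hypothetical fragment that paraphrases the project-delay example, and the `rules` helper is an assumption made for this sketch, not part of any particular tool.

```python
# Hypothetical miniature tree; the questions and decisions paraphrase the text.
TREE = {
    "question": "main cause of delay",
    "answers": {
        "understaffed": {
            "question": "reason for understaffing",
            "answers": {
                "turnover": "develop a succession plan",
                "fluctuating demand": "recruit contractors at peak times",
            },
        },
        "scope changes": "develop a documented communication plan",
    },
}


def rules(node, conditions=()):
    """Yield one (conditions, decision) pair per root-to-leaf path."""
    if isinstance(node, str):  # a leaf holds the decision
        yield conditions, node
        return
    for answer, child in node["answers"].items():
        yield from rules(child, conditions + ((node["question"], answer),))


for conditions, decision in rules(TREE):
    tests = " AND ".join(f"{q} = {a}" for q, a in conditions)
    print(f"IF {tests} THEN {decision}")
```

A tree with three leaves yields three rules, and a machine can apply them by checking the conditions of each rule against the facts of a case.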
This use of management tools and analytics has become a competitive advantage in the era of the technology revolution in which we live.