
Decision Tree Pruning: Fundamentals and Applications

About this ebook

What Is Decision Tree Pruning


In machine learning and search algorithms, pruning is a data compression technique that reduces the size of decision trees by removing sections of the tree that are non-critical or redundant for classifying instances. This reduces the amount of data that has to be stored in the tree. By simplifying the final classifier, pruning reduces overfitting and thereby improves predictive accuracy.


How You Will Benefit


(I) Insights and validations about the following topics:


Chapter 1: Decision Tree Pruning


Chapter 2: Decision Tree Learning


Chapter 3: Data Compression


Chapter 4: Alpha-Beta Pruning


Chapter 5: Null-Move Heuristic


Chapter 6: Horizon Effect


Chapter 7: Minimum Description Length


Chapter 8: Bayesian Network


Chapter 9: Ensemble Learning


Chapter 10: Artificial Neural Network


(II) Answers to the public's top questions about decision tree pruning.


(III) Real-world examples of the use of decision tree pruning in many fields.


Who This Book Is For


Professionals, undergraduate and graduate students, enthusiasts, hobbyists, and anyone who wants to go beyond basic knowledge of decision tree pruning.


What is Artificial Intelligence Series


The artificial intelligence book series provides comprehensive coverage of more than 200 topics. Each ebook covers a specific artificial intelligence topic in depth and is written by experts in the field. The series aims to give readers a thorough understanding of the concepts, techniques, history, and applications of artificial intelligence. Topics covered include machine learning, deep learning, neural networks, computer vision, natural language processing, robotics, ethics, and more. The ebooks are written for professionals, students, and anyone interested in learning about the latest developments in this rapidly advancing field.
The series provides an in-depth yet accessible exploration, from fundamental concepts to state-of-the-art research. With over 200 volumes, readers gain a thorough grounding in all aspects of artificial intelligence. The ebooks are designed to build knowledge systematically, with later volumes building on the foundations laid by earlier ones. This comprehensive series is an indispensable resource for anyone seeking to develop expertise in artificial intelligence.

Language: English
Release date: Jun 28, 2023

    Book preview

    Decision Tree Pruning - Fouad Sabry

    Chapter 1: Decision tree pruning

    In machine learning and search algorithms, pruning is a data compression technique that reduces the size of decision trees by removing portions of the tree that are non-critical or redundant for classifying instances. This reduces the amount of data that has to be stored in the tree. Pruning simplifies the final classifier and, by reducing overfitting, improves predictive accuracy.

    When building a decision tree, a central question is how large the final tree should be. A tree that is too large risks overfitting the training data and generalizing poorly to new samples, while a tree that is too small may fail to capture important structural information about the sample space. It is hard to tell when the tree-growing algorithm should stop, however, because it is impossible to know whether adding a single extra node would significantly decrease the error. This problem is known as the horizon effect. A common strategy is to grow the tree until each node contains a small number of instances and then prune away the nodes that do not provide additional information.

    Pruning should reduce the size of a learning tree without reducing its predictive accuracy as measured by a cross-validation set. There are many pruning techniques, which differ in the measure they use to optimize performance.

    Pruning processes can be divided into two types: pre-pruning and post-pruning.

    Pre-pruning procedures prevent a complete induction of the training set by replacing a stop() criterion in the induction algorithm (for example, a maximum tree depth, or a requirement that the information gain of the chosen attribute exceed a threshold, gain(Attr) > minGain). Pre-pruning methods are considered more efficient because the tree never grows to its full size; it stays small from the start. However, pre-pruning methods share the horizon effect problem: the stop() criterion can end the induction prematurely, which is undesirable.
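    As a minimal sketch of such a condition (the function name stop and the thresholds max_depth, min_gain, and min_examples are illustrative choices, not taken from any particular library), a pre-pruning check inside an induction routine might look like this in Python:

    def stop(depth, examples, best_gain, max_depth=5, min_gain=0.01, min_examples=10):
        """Pre-pruning stop() condition: halt induction at this node before the
        tree is fully grown. All thresholds are illustrative hyperparameters."""
        if depth >= max_depth:            # tree is already deep enough
            return True
        if len(examples) < min_examples:  # too few examples to split reliably
            return True
        if best_gain < min_gain:          # best split fails gain(Attr) > minGain
            return True
        return False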

    Post-pruning (or simply pruning) is the most common way of simplifying trees. Here, nodes and subtrees are replaced with leaves to reduce complexity. Pruning can not only significantly reduce the size of the tree but also improve its accuracy in classifying previously unseen items: although accuracy on the training set may decline, the accuracy of the tree's classifications on unseen data generally improves.

    Pruning procedures are further distinguished by the direction in which they traverse the tree: top-down or bottom-up.

    Bottom-up procedures start at the last node in the tree (the lowest point) and work their way upward recursively, determining the relevance of each individual node. If a node is not relevant for the classification, it is dropped or replaced by a leaf. The advantage of this approach is that no relevant subtree can be lost. Methods in this category include reduced error pruning (REP), minimum cost complexity pruning (MCCP), and minimum error pruning (MEP).

    Top-down procedures, in contrast, start at the root of the tree. Following the structure of the tree downward, a relevance check is performed at each node, which decides whether the node is relevant for the classification of all n items. Pruning the tree at an inner node can remove an entire subtree, regardless of how relevant that subtree may have been. One representative of this class is pessimistic error pruning (PEP), which achieves quite good results on unseen items.

    Reduced error pruning is one of the simplest forms of pruning. Starting at the leaves, each node is replaced with its most popular class. If the change has no adverse effect on prediction accuracy, it is kept. While somewhat naive, reduced error pruning has the advantages of simplicity and speed.
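    The following Python sketch illustrates the idea under some simplifying assumptions: it presumes a hypothetical Node type with children, label, majority_class, and is_leaf members, and a hypothetical accuracy helper that evaluates the whole tree on a validation set.

    def reduced_error_prune(node, tree, validation_set):
        """Reduced error pruning: work upward from the leaves, replace each node
        with a leaf predicting its most popular class, and keep the change only
        if accuracy on the validation set does not drop."""
        if node.is_leaf():
            return
        for child in node.children:                 # prune the subtrees first (bottom-up)
            reduced_error_prune(child, tree, validation_set)

        before = accuracy(tree, validation_set)     # accuracy with the subtree intact
        saved_children = node.children
        node.children = []                          # temporarily turn the node into a leaf
        node.label = node.majority_class            # ... predicting its most popular class
        after = accuracy(tree, validation_set)

        if after < before:                          # pruning hurt accuracy: undo it
            node.children = saved_children
            node.label = None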

    Cost complexity pruning generates a series of trees $T_0, \dots, T_m$ where $T_0$ is the initial tree and $T_m$ is the root alone.

    At step $i$, the tree $T_i$ is created by removing a subtree from tree $T_{i-1}$ and replacing it with a leaf node whose value is chosen as in the tree-building algorithm.

    The subtree to be removed is chosen as follows:

    Define the error rate of tree $T$ over data set $S$ as $\operatorname{err}(T,S)$.

    The subtree $t$ that minimizes

    $$\frac{\operatorname{err}(\operatorname{prune}(T,t),S)-\operatorname{err}(T,S)}{\left|\operatorname{leaves}(T)\right|-\left|\operatorname{leaves}(\operatorname{prune}(T,t))\right|}$$

    is chosen for removal.

    The function $\operatorname{prune}(T,t)$ denotes the tree obtained by pruning the subtree $t$ from the tree $T$.

    Once the series of trees has been created, the tree with the best generalized accuracy, measured on a training set or by cross-validation, is chosen.
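    A rough sketch of a single pruning step in Python, assuming hypothetical helpers subtrees, prune, leaves, and err that behave exactly as the definitions above describe:

    def cost_complexity_step(tree, data):
        """One step of cost complexity pruning: among all candidate subtrees,
        remove the one with the smallest increase in error per leaf removed."""
        best_subtree, best_ratio = None, float("inf")
        for t in subtrees(tree):                      # every internal node roots a candidate subtree
            pruned = prune(tree, t)                   # tree with t collapsed into a leaf
            increase_in_error = err(pruned, data) - err(tree, data)
            leaves_removed = len(leaves(tree)) - len(leaves(pruned))
            ratio = increase_in_error / leaves_removed
            if ratio < best_ratio:
                best_subtree, best_ratio = t, ratio
        return prune(tree, best_subtree)

    # Applying cost_complexity_step repeatedly until only the root remains yields
    # the series T_0, ..., T_m from which the final tree is selected.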

    {End Chapter 1}

    Chapter 2: Decision tree learning

    Decision tree learning is a supervised learning approach used in statistics, data mining, and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of observations.

    Classification trees are tree models in which the target variable takes a finite set of values; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Regression trees are decision trees in which the target variable takes continuous values, typically real numbers. More generally, the concept of a regression tree can be extended to any kind of object equipped with pairwise dissimilarities, such as categorical sequences.

    A decision tree is a useful tool for decision analysis because it can represent decisions and the decision-making process visually and explicitly. In data mining, a decision tree describes data (but the resulting classification tree can be an input for decision making).

    Decision tree learning is a method commonly used in data mining. The goal is to create a model that predicts the value of a target variable from several input variables.

    A decision tree is a simple representation for classifying examples. For this section, assume that all of the input features have finite discrete domains and that there is a single target feature called the classification. Each element of the domain of the classification is called a class. A decision tree or classification tree is a tree in which each internal (non-leaf) node is labeled with an input feature. The arcs coming from a node labeled with an input feature are either labeled with each of the possible values of that feature or lead to a subordinate decision node on a different input feature. Each leaf of the tree is labeled with a class or a probability distribution over the classes, signifying that the tree has classified the data set into either a specific class or a particular probability distribution (which, if the decision tree is well constructed, is skewed towards certain subsets of classes).
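    As an illustration of this structure (the class and field names below are hypothetical, not taken from the book), a decision tree over discrete-valued features can be represented and evaluated directly in Python:

    from dataclasses import dataclass, field
    from typing import Dict, Optional

    @dataclass
    class Node:
        """A decision-tree node: an internal node tests one input feature and
        routes each possible value to a child; a leaf carries a class label."""
        feature: Optional[str] = None                              # input feature tested here (None at leaves)
        children: Dict[str, "Node"] = field(default_factory=dict)  # feature value -> subtree
        label: Optional[str] = None                                # class label, set only at leaves

    def classify(node: Node, example: Dict[str, str]) -> str:
        """Follow the arc matching the example's value for each tested feature
        until a leaf is reached, then return its class label."""
        while node.label is None:
            node = node.children[example[node.feature]]
        return node.label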

    A tree is built by splitting the source set, which constitutes the root node of the tree, into subsets, which constitute the successor children. The splitting is based on a set of splitting rules derived from the classification features. This process is repeated on each derived subset in a recursive manner called recursive partitioning. The recursion is complete when the subset at a node has all the same values of the target variable, or when splitting no longer adds value to the predictions. This process of top-down induction of decision trees (TDIDT) is by far the most common strategy for learning decision trees from data; a sketch of it appears below.
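    A compact Python sketch of top-down induction under the same assumptions, reusing the hypothetical Node class above; choose_best_feature is passed in as a stand-in for whatever splitting rule (information gain, for example) the algorithm uses:

    from collections import Counter

    def majority_class(labels):
        """Most common class among the given labels."""
        return Counter(labels).most_common(1)[0][0]

    def build_tree(examples, features, choose_best_feature):
        """Top-down induction of decision trees (TDIDT): split the examples on a
        chosen feature and recurse on each subset until the subset is pure or no
        features remain. Each example is a (feature_dict, class_label) pair."""
        labels = [y for _, y in examples]
        if len(set(labels)) == 1 or not features:    # pure node, or nothing left to split on
            return Node(label=majority_class(labels))

        best = choose_best_feature(examples, features)
        node = Node(feature=best)
        for value in {x[best] for x, _ in examples}:
            subset = [(x, y) for x, y in examples if x[best] == value]
            remaining = [f for f in features if f != best]
            node.children[value] = build_tree(subset, remaining, choose_best_feature)
        return node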

    In data mining, decision trees can also be described as the combination of mathematical and computational techniques used to aid the description, categorization, and generalization of a given data set.

    The information is stored in records of the form:

    $(\mathbf{x}, Y) = (x_1, x_2, x_3, \ldots, x_k, Y)$

    The dependent variable, $Y$, is the target variable that we are trying to understand, classify, or generalize.

    The vector $\mathbf{x}$ is composed of the features $x_1, x_2, x_3$, etc., that are used for that task.
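    For instance, a single record in this form for a hypothetical weather data set might look like the following (the feature names and values are purely illustrative):

    # One record (x, Y): the feature vector x followed by the dependent variable Y.
    x = {"outlook": "sunny", "temperature": 31, "humidity": 0.62, "windy": False}
    Y = "don't play"        # the class we are trying to predict
    record = (x, Y)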

    There are two primary varieties of decision trees used in data mining:

    Classification tree analysis is used when the predicted outcome is the (discrete) class to which the data belongs.

    Regression tree analysis is used when the predicted outcome can be considered a real number (for example, the price of a house or a patient's length of stay in a hospital).

    Classification and regression tree (CART) analysis is an umbrella term that refers to either of the above procedures; the term was first introduced by Breiman et al. in 1984.

    Some techniques, often called ensemble methods, construct more than one decision tree:

    Boosted trees incrementally build an ensemble by training each new instance to emphasize the training instances previously mis-modeled. A typical example is AdaBoost. Boosted trees can be used for both regression-type and classification-type problems.

    Bootstrap aggregated (or bagged) decision trees, an early ensemble method, build multiple decision trees by repeatedly resampling the training data with replacement and voting the trees for a consensus prediction (a small sketch follows the ensemble methods listed below).

    A random forest classifier is a specific type of bootstrap aggregating.

    In a rotation forest, every decision tree is trained by first applying principal component analysis (PCA) to a random subset of the input features.
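    As a minimal illustration of bootstrap aggregating, the following Python sketch reuses the hypothetical build_tree and classify helpers from the earlier sketches in this chapter; the resampling with replacement and the consensus vote follow the description above:

    import random
    from collections import Counter

    def bagged_trees(examples, features, choose_best_feature, n_trees=25):
        """Bootstrap aggregating: train each tree on a bootstrap sample drawn
        with replacement from the training data."""
        forest = []
        for _ in range(n_trees):
            sample = [random.choice(examples) for _ in range(len(examples))]
            forest.append(build_tree(sample, features, choose_best_feature))
        return forest

    def bagged_predict(forest, example):
        """Vote the trees for a consensus prediction on a single example."""
        votes = [classify(tree, example) for tree in forest]
        return Counter(votes).most_common(1)[0][0]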

    Among the most notable decision tree algorithms are:

    ID3 (Iterative Dichotomiser 3)

    C4.5 (successor of ID3)

    CART (Classification And Regression Tree)
