
Training tree-based models with TensorFlow in just a few lines of code



A guest post by Dinko Franceschi, Broad Institute of MIT and Harvard

Kaggle has become the go-to place to practice data science skills and participate in machine learning model-building competitions. This tutorial will provide an easy-to-follow walkthrough of how to get started with a Kaggle notebook using TensorFlow Decision Forests, a library that allows you to train tree-based models (like random forests and gradient-boosted trees) in TensorFlow.

Why should you be interested in decision forests? Broadly speaking, there are two types of Kaggle competitions, and the winning solution (neural networks or decision forests) depends on the kind of data you're working with.

If you're working with a tabular data problem (training a model to classify data in a spreadsheet, which is an extremely common scenario), the winning solution is often a decision forest. However, if you're working with a perception problem that involves teaching a computer to see or hear (for example, image classification), the winning model is usually a neural network.

Here's where the good news starts. You can implement a decision forest in TensorFlow with just a few lines of code. This relatively simple model often outperforms a neural network on many Kaggle problems.

We will explore the decision forests library with a simple dataset from Kaggle, and we will build our model with Kaggle Kernels, which let you build and train models entirely online using free cloud compute, similar to Colab. The dataset contains vehicle information such as cost, number of doors, occupancy, and maintenance costs, which we will use to predict an evaluation of each car.

Kaggle Kernels can be accessed through your Kaggle account. If you do not have an account, please begin by signing up. On the home page, select the "Code" option on the left menu and select "New Notebook," which will open a new Kaggle Kernel.



Once we have opened a new notebook from Kaggle Kernels, we download the car evaluation dataset to our environment. Click "Add data" near the top right corner of your notebook, search for "car evaluation," and add the dataset.


Now we are ready to start writing code. Install the TensorFlow Decision Forests library and add the necessary imports, as shown below. The code in this blog post is taken from the Build, train and evaluate models with the TensorFlow Decision Forests tutorial, which contains additional examples to look at.

!pip install tensorflow_decision_forests

import numpy as np
import pandas
import tensorflow_decision_forests as tfdf

We will now load the dataset. Note that the file we downloaded does not contain headers, so we will add those first, based on the information provided on the Kaggle page for the dataset. It is good practice to inspect your dataset before you start working with it, by opening it in your favorite text or spreadsheet editor.

# The file has no header row, so read it with header=None and then
# attach column names taken from the dataset's Kaggle page
df = pandas.read_csv("../input/car-evaluation-data-set/car_evaluation.csv",
                     header=None)
col_names = ['buying price', 'maintenance price', 'doors', 'persons',
             'lug_boot', 'safety', 'class']
df.columns = col_names
df.head()

We then split the dataset into training and test sets:

def split_dataset(dataset, test_ratio=0.30):
  # Randomly route ~30% of the examples to the test set
  test_indices = np.random.rand(len(dataset)) < test_ratio
  return dataset[~test_indices], dataset[test_indices]


train_ds_pd, test_ds_pd = split_dataset(df)
print("{} examples in training, {} examples for testing.".format(
    len(train_ds_pd), len(test_ds_pd)))

Finally, we will convert the dataset into the tf.data format. This is a high-performance format that TensorFlow uses to train models more efficiently, and with TensorFlow Decision Forests, you can convert your dataset to this format with one line of code:


train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_ds_pd, label="class")
test_ds = tfdf.keras.pd_dataframe_to_tf_dataset(test_ds_pd, label="class")
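To sanity-check the conversion, you can peek at a single batch of the resulting tf.data.Dataset (a small illustrative snippet, not part of the original tutorial):

# Each element is a (features, labels) pair, where features is a dict
# keyed by column name
for features, labels in train_ds.take(1):
  print(list(features.keys()))
  print(labels)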

Now you can train your model right away by executing the following:

model = tfdf.keras.RandomForestModel()
model.fit(train_ds)

The library has good defaults, which are a fine place to start for most problems. For advanced users, random forests are highly configurable, and there are lots of options to choose from in the API doc.
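For instance, here is a minimal sketch of a configured model; num_trees and max_depth are two of the documented hyperparameters, and the values below are purely illustrative, not recommendations:

# Illustrative hyperparameter values; see the TF-DF API doc for the
# full list of options
tuned_model = tfdf.keras.RandomForestModel(
    num_trees=500,  # grow more trees than the default
    max_depth=16,   # cap the depth of each tree
)
tuned_model.fit(train_ds)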

Once you have trained the model, you can see how it will perform on the test data.

model.compile(metrics=["accuracy"])
print(model.evaluate(test_ds))

In just a few lines of code, you reached an accuracy above 95% on this small dataset! It is a simple dataset, and one might argue that neural networks could also yield impressive results. They absolutely can (and do), especially when you have very large datasets (think: hundreds of thousands of examples, or more). However, neural networks require more code and are resource intensive, demanding significantly more compute power.

Easy preprocessing

Decision forests have another important advantage: there are fewer steps to preprocess the data. Notice in the code above that you were able to pass a dataset with both categorical and numeric values directly to the decision forest. You did not have to do any preprocessing, such as normalizing numeric values, converting strings to integers, or one-hot encoding them. This has major benefits: it makes decision forests simpler to work with (so you can train a model quickly), and there is less code that can go wrong.

Below, you will see some important differences between the two techniques.

Easy to interpret

A significant advantage of decision forests is that they are easy to interpret. While the pipeline for training decision forests differs significantly from that of neural networks, this is a major reason to select them for a given task: feature importance is particularly straightforward to determine with decision forests (ensembles of decision trees). Notably, the TensorFlow Decision Forests library makes it possible to visualize feature importance with its model plotter function. Let's see below how this works!

tfdf.model_plotter.plot_model_in_colab(model, tree_idx=0)

In the root of the tree on the left, we see the number of examples (1728) and the class distribution indicated by the different colors. Here our model first looks at the number of persons that the car can fit: the largest section, shown in green, stands for 2 persons, and the red section for 4 persons. As we go down the tree, we continue to see how it splits and the corresponding number of examples; based on each condition, examples are branched to one of two paths. Interestingly, from here we can also determine the importance of a feature by examining all of the splits on that feature and computing how much it lowered the variance.
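Feature importances can also be read programmatically. Here is a minimal sketch using the TF-DF model inspector:

# The inspector exposes statistics collected during training, including
# several variable-importance metrics
inspector = model.make_inspector()
for metric_name, importances in inspector.variable_importances().items():
  print(metric_name, importances)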

Decision Trees vs. Neural Networks

Neural networks undoubtedly have incredible representation learning capabilities. While they are very powerful in this regard, it is important to consider whether they are the right tool for the problem at hand. When working with neural networks, one must think a lot about how to construct the layers. In contrast, decision forests are ready to go out of the box (though, of course, advanced users can tune a variety of parameters).

Before even building a neural network layer by layer, in most cases one must perform feature preprocessing: for example, normalizing the features to have a mean around 0 and a standard deviation of 1, and converting strings to numbers. This initial step can be skipped entirely with tree-based models, which natively handle mixed data.
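For comparison, here is a minimal sketch of the kind of preprocessing a neural network would typically need for a dataset like this one (StringLookup and Normalization are standard Keras preprocessing layers; the snippet is illustrative and not part of the original tutorial):

import numpy as np
import tensorflow as tf

# Map a categorical string column to integer indices
safety_lookup = tf.keras.layers.StringLookup()
safety_lookup.adapt(train_ds_pd["safety"].to_numpy())
safety_ids = safety_lookup(train_ds_pd["safety"].to_numpy())

# Scale numeric features to mean ~0 and standard deviation ~1
# (illustrative only: the columns in this dataset are strings, so a
# real pipeline would first have to parse them into numbers)
normalizer = tf.keras.layers.Normalization()
normalizer.adapt(np.array([[2.0], [3.0], [4.0], [5.0]]))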

As seen in the code above, we were able to obtain results in just a few steps. Once we have our desired metrics, we have to interpret them within the context of our problem, and perhaps one of the most significant strengths of decision trees is their interpretability. Starting at the root of the diagram output above, we can follow the branches and quickly get a good idea of how the model made its decisions. In contrast, neural networks are a "black box" that can be difficult to interpret and to explain to a non-technical audience.

Learning more

If you'd like to learn more about TensorFlow Decision Forests, the best place to start is with the project homepage. You can also check out this previous article for more background. And if you have any questions or feedback, the best place to ask them is on https://discuss.tensorflow.org/ using the tag "tfdf". Thanks for reading!
