Isolation Forest is one of the most widely used techniques for detecting anomalies in data. It’s based on a “forest” of trees, where each isolation tree isolates anomalous observations from the rest of the data points. Despite its simplicity, speed and intuitiveness, it has one drawback: the lack of explanation. Why is a particular observation considered anomalous by the algorithm? How can the output be interpreted?

There are two possible kinds of interpretation: global and local. **Global** interpretation aims to explain the model as a whole and to understand which features play the most relevant role overall. On…

I first came across this graphical representation at the beginning of my statistics degree. I liked it right away: it was so simple and yet full of information. At the same time, it is easy to take for granted some knowledge that is essential to understanding this plot. For this reason, I am writing this tutorial, which focuses on all the details that can escape your attention.

You are probably asking why you should use it. The first thing you should know for now is that it’s an efficient way to display all the characteristics of the data…

*This post is the fifth in a series of guides to building deep learning models with Pytorch. The full series is listed below:*

- Pytorch Tutorial for Beginners
- Understand Tensor Dimensions in DL models
- CNN & Feature visualizations
- Hyperparameter tuning with Optuna
- K Fold Cross Validation (this post)
- Convolutional Autoencoder
- Denoising Autoencoder
- Variational Autoencoder

The goal of the series is to make Pytorch as intuitive and accessible as possible through example implementations. There are many tutorials on the Internet that use Pytorch to build many types of challenging models, but they can also be confusing because…

*This is the fourth post of the **NLP tutorial series**. This guide will let you understand how to improve the word cloud using TF-IDF representation.*

Have you heard of the word cloud? It’s an amazing visualization of text data that aims to capture the keywords. The more often a word appears, the bigger it is drawn in the representation.

This is the standard Word Cloud that you can find everywhere. But is it really useful? Can counting the occurrences of words help in understanding their importance? There are surely some words repeated that can be considered as…

*This is the third post of the **NLP tutorial series**. This guide will walk you step by step through implementing TF-IDF from scratch and comparing the results with those of scikit-learn’s built-in `TfidfVectorizer`.*

The models that deal with huge amounts of text to perform classification, speech recognition, or translation need an additional step to process these types of data. The text data needs to be transformed into something else, numbers, which can be understood by computers. There are many techniques available to create new features using such data.
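
As a rough illustration of what "turning text into numbers" can mean, here is a minimal sketch that counts word occurrences per document using only the standard library. The toy corpus, the whitespace tokenization, and the helper name `to_counts` are assumptions made for this example, not part of the original tutorial.

```python
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
docs = ["the cat sat", "the dog sat", "the cat saw the dog"]

# Naive whitespace tokenization; real text needs proper pre-processing.
tokenized = [doc.split() for doc in docs]

# Vocabulary: sorted unique words, one vector position per word.
vocab = sorted({word for tokens in tokenized for word in tokens})


def to_counts(tokens):
    """Return the count of each vocabulary word in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


# One numeric vector per document, aligned with `vocab`.
vectors = [to_counts(tokens) for tokens in tokenized]

for doc, vec in zip(docs, vectors):
    print(doc, vec)
```

Each document becomes a list of integers that a model can consume. Raw counts treat every word as equally informative, which is the limitation that weighting schemes try to address.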

One of them is **Term Frequency-Inverse Document Frequency**, also…

Welcome to my NLP tutorial series! I have written a series of tutorials to introduce you to the world of Natural Language Processing. This dynamic field is difficult to summarize in a single post, so I decided to spread what I have learnt across several posts.

- Text pre-processing in Python
- Bag-of-Words with an Example in Python
- TF-IDF with an Example in Python
- Word Clouds with TF-IDF

For now, these are the posts that will be published soon. More posts will be added to the series over time.

For most of my master’s in Data Science, graphs were underrated and left in a corner. It was only during my internship that I began to understand the importance of data visualization. At first, visualizations can seem really boring, but they let you reach insights you would never get by looking only at the raw data. They are the first thing you need to check before modifying your dataset and applying ML algorithms, and they will accompany you throughout the search for truth.

In my previous article, I introduced a library to produce interactive visualizations…

When you begin to write, it’s amazing how many discoveries you can make along the way.

I remember that I didn’t know anything about the world I was entering when I wrote my first post on Medium in October. In the meantime, I have also learnt small details that help improve an article’s performance.

Below, I show 7 ways to improve the effectiveness and the quality of your posts:

There are two possible choices you can make once you have written an article:

- publish it on your profile
- submit the draft of your story to a publication

The…

*This post is the fourth in a series of guides to building deep learning models with Pytorch. The full series is listed below:*

- Pytorch Tutorial for Beginners
- Understand Tensor Dimensions in DL models
- CNN & Feature visualizations
- Hyperparameter tuning with Optuna (this post)
- K Fold Cross-Validation
- Convolutional Autoencoder
- Denoising Autoencoder
- Variational Autoencoder

The goal of the series is to make Pytorch as intuitive and accessible as possible through example implementations. There are many tutorials on the Internet that use Pytorch to build many types of challenging models, but they can also be confusing because there…

- Introduction to time series analysis
- pandas time series indexing
- Visualizations with plotly express
- pandas functions, such as `resample`, `rolling`, and `shift`

If you work with time series data in Python, you should know some tricks to avoid computationally expensive operations. Luckily, pandas is a library with a lot of useful functions to apply. I think it’s important to have an overview of what you can do with time series before moving on to more complex analyses, like feature extraction and predictions.
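
To give a feel for the three pandas functions named in the list above, here is a small sketch on a made-up daily series. The dates, values, and variable names are assumptions chosen for illustration; only `resample`, `rolling`, and `shift` come from the text.

```python
import pandas as pd

# Hypothetical daily series: ten days with the made-up values 1..10.
idx = pd.date_range("2021-01-01", periods=10, freq="D")
s = pd.Series(range(1, 11), index=idx, dtype="float64")

# resample: group the data into 5-day bins and average each bin.
five_day_mean = s.resample("5D").mean()

# rolling: 3-day moving average; the first two values are NaN
# because a full window is not yet available.
rolling_mean = s.rolling(window=3).mean()

# shift: move values one step forward so each day can be
# compared with the previous one.
daily_change = s - s.shift(1)

print(five_day_mean)
print(rolling_mean.head())
print(daily_change.head())
```

All three operations are vectorized over the datetime index, so they avoid explicit Python loops over the rows.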

In this article, I will show that one of the most powerful advantages of pandas time series is the…

Data Science student | Top 1500 writer on Medium | I like to share the concepts I learn every day | https://www.linkedin.com/in/eugenia-anello-545711146