Data Science

Image by author

I first met the confusion matrix during the first year of my Data Science master’s degree. The first time the professor explained it, I felt nothing but CONFUSION! For this reason, I want to explain in simple words the concepts behind this matrix, which will be your partner every time you need to evaluate the performance of a model.

So, what is a confusion matrix? Why do we need it? In general, it’s a tool that helps you understand whether a model is working well. Moreover, you can derive many evaluation measures from it, such as accuracy, precision…
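As a minimal sketch of how such measures follow from the matrix, the snippet below counts the four cells of a binary confusion matrix in plain Python and derives accuracy and precision from them. The helper name `confusion_counts` and the labels are hypothetical, invented only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Hypothetical labels, not taken from any real dataset.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

# Accuracy: share of all predictions that are correct.
accuracy = (tp + tn) / (tp + fp + fn + tn)
# Precision: share of positive predictions that are correct.
precision = tp / (tp + fp)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f}")
```

In practice a library routine such as `sklearn.metrics.confusion_matrix` would usually replace the hand-written counter; the loop is spelled out here only to show where each cell comes from.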


An overview of Cross Validation techniques using sklearn

Image made by author

Machine learning methods often fail to generalize because they learn particular features of the training set that are not present in the test set. These features are not representative, and the result is overfitting: the model fits the training data too closely and is not able to generalize to new samples. There are many ways to address this issue, such as regularization, hyperparameter tuning, and K-fold cross-validation. In this article, I focus on this last topic because I think it has a relevant role to have…
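A short sketch of K-fold cross-validation with sklearn might look like the following. The dataset (Iris), the model (logistic regression), and the choice of 5 folds are assumptions made for illustration, not choices taken from the article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Illustrative dataset and model; any estimator with fit/predict would do.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Split the data into 5 folds; each fold is used once as the validation set
# while the model is trained on the remaining 4.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)

print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())
```

Averaging the per-fold scores gives an estimate of performance that depends less on one particular train/test split.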


Machine Learning

An overview of the basic concepts to design Pytorch Neural Networks

Image made by author

PyTorch is a deep learning framework released by Facebook’s AI Research lab in 2016. It’s widely used in computer vision applications. Moreover, it’s characterized by simplicity, strong GPU support, and built-in implementations of deep learning algorithms. Thanks to these features, it’s also one of the most used libraries in academic research.

There are a lot of PyTorch tutorials on the web and a lot of documentation on the PyTorch website. But too much information can be confusing and waste a lot of time. My goal is to show an overview of the basic functions and classes available in PyTorch along with…


Natural Language Processing

A simple guide to build a speech recognizer using Google’s API

Figure 1: Photo on Pixabay

Speech recognition is an interesting task that can improve your quality of life. In this never-ending Covid period, I need to watch many recorded lessons, and it’s so easy to lose concentration. At the same time, having all the recordings available on my university’s website has turned me into a perfectionist, so I would like to capture every word in my notes. But that is costly, because it takes a lot of work and time.

Luckily, providers such as Google, Amazon, IBM, and many others already offer API services that convert…


Statistics, R

Balance interpretability and predictive power through non-linear models

Photo by Dan Freeman on Unsplash

Generalized Additive Models (GAMs) are extensions of linear models that allow modeling nonlinear relationships in a flexible way. Moreover, GAMs are a middle ground between simple models such as linear regression and more complex models like gradient boosting.
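For reference, a GAM is usually written in the following standard form. The symbols are not taken from the article: $g$ denotes a link function, $\beta_0$ an intercept, and each $f_j$ a smooth function of a single predictor $x_j$.

```latex
g\big(\mathbb{E}[\,y \mid x_1, \dots, x_p\,]\big) = \beta_0 + \sum_{j=1}^{p} f_j(x_j)
```

A linear model is the special case in which every $f_j(x_j) = \beta_j x_j$. This is why a GAM keeps an additive, per-predictor interpretation while allowing each effect to be nonlinear.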

Linear models are easy to interpret, are used for inference, and make it possible to understand the linear relationship between the predictors and the response, but they can suffer from high bias. …


Tutorial

Configure Ubuntu shell as terminal in Pycharm environment

Image made by Author

If you have Windows 10 and you need to install Ubuntu for study or work purposes, you are in the right place. Have you tried to run Ubuntu in VirtualBox in the past? If you have, your computer probably became very slow, running it took too much effort, and you lost a lot of time.

Forget about it. There is a way that doesn’t fill up all your computer’s memory and is very fast to use. You only need to install the Ubuntu shell from the Microsoft Store and it’s done, without any complicated steps.

The Ubuntu Shell can be installed…


Programming

Photo by Rodolfo Barretto on Unsplash

Working with dates and times in Python is not straightforward when you analyze a dataset for the first time. There are a lot of features to take into account, such as the year, the month, the day, the hour, the minutes, and the seconds, but also more complex features such as durations, weekdays, and time zones. For this reason, I will talk about a Python module that manipulates this type of information: datetime [1].

Datasets often have dates represented as strings, and you need to convert them to datetime format in order to work with time series data. …
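A minimal sketch of this conversion with the standard `datetime` module is shown below. The date strings and their format are hypothetical; a real dataset may need a different format string.

```python
from datetime import datetime

# Hypothetical date strings, as they might appear in a raw dataset.
raw_dates = ["2020-11-03 14:30:00", "2020-12-25 08:05:10"]

# strptime parses a string into a datetime object using an explicit format.
parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in raw_dates]

first = parsed[0]
# Simple features: year, month, day, hour.
print(first.year, first.month, first.day, first.hour)
# weekday() returns 0 for Monday through 6 for Sunday.
print(first.weekday())

# Subtracting two datetimes yields a timedelta, i.e. a duration.
duration = parsed[1] - parsed[0]
print(duration.days, "full days between the two timestamps")
```

Once the strings are real `datetime` objects, features such as the weekday or the elapsed time between two rows can be computed directly instead of by string manipulation.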


Deep Learning

Configure a Conda environment in Pycharm to enable the use of CUDA

Image made by Author

PyTorch is a Python package used to develop deep learning models with maximum flexibility and speed. Its core data structure is the Tensor, which is essentially an n-dimensional array used for matrix computations, so it’s similar to a NumPy array. The advantage of using a PyTorch Tensor instead of a NumPy array is that a PyTorch Tensor can run on a GPU [1].
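The relationship between the two can be sketched as follows, assuming both `numpy` and `torch` are installed. The array values are arbitrary, and the code falls back to the CPU when no CUDA device is available.

```python
import numpy as np
import torch

# A small NumPy array with arbitrary values.
array = np.arange(6, dtype=np.float32).reshape(2, 3)

# Wrap the NumPy array as a PyTorch tensor.
tensor = torch.from_numpy(array)

# Use the GPU if CUDA is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor = tensor.to(device)

# Matrix product of the tensor with its transpose, computed on `device`.
product = tensor @ tensor.T
print(product.shape, product.device)
```

The same code runs unchanged on either device, which is the practical benefit of tensors over plain arrays.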

Installing PyTorch is not hard in itself, but the steps to enable the GPU on a local machine are not trivial. …


Programming

How to use Python and Twitter API to create your own Twitter dataset

Figure 1: Photo by JillWellington on Pixabay

Social networks are a constant part of our lives nowadays. Their popularity can be explained by accessibility and convenience, which allow users to share huge amounts of information with limited or no restrictions on content. This continuous and rich mass of data is made available by these platforms for the purpose of studying sentiment about brands, products, events, recent news, and social and political issues.

In this Covid-19 period, the use of these platforms has grown dramatically. On Twitter, there has been an increase in misinformation related to the pandemic. For this reason, I am going to…


A library that combines the flexibility of scikit-learn and the power of PyTorch

Image made by Author

The goal of this post is to use a tool that trains and evaluates a PyTorch model in a simple way. This tool is Skorch, a scikit-learn compatible neural network library that wraps PyTorch, so it makes it possible to use PyTorch with sklearn. Moreover, it takes advantage of scikit-learn’s methods and utilities such as fit, predict, and GridSearchCV [1]. Here it is applied to MNIST, a dataset composed of images of handwritten digits: 60,000 for training and 10,000 for testing. …

Eugenia Anello

I am a Data Science student and a travel enthusiast | I learn something new every day | https://www.linkedin.com/in/eugenia-anello-545711146
