
Exactly this time last year, I had just begun my data science journey, so machine learning and programming were still quite new to me. Over the past year, I have spent a lot of time and resources learning and practicing various machine learning and deep learning techniques for solving problems. It still came as a shock when I received the news that my submitted model for the FORCE 2020 machine learning lithology prediction competition had made it into the top 10 submissions. My solution ranked 24th on the open test leaderboard. The shock came because I had lost confidence in my model just a few days before the end of the competition: the scores of the top teams on the leaderboard (LB) made it seem as though a data hack had been discovered. …



The term linear model implies that the model is specified as a linear combination of features. Based on training data, the learning process computes one weight for each feature to form a model that can estimate or predict the target value. For example, if the target (dependent variable) is the amount of insurance a customer will purchase and the independent variables are age and income, a simple linear model would be the following:

>>> Estimated target = 0.2 + 5 * age + 0.0003 * income
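As a minimal sketch, the example model can be evaluated in Python as an intercept plus a weighted sum of the features. The weights come from the formula in the text; the names `INTERCEPT`, `WEIGHTS`, and `estimate_target`, as well as the sample customer, are assumptions made up for illustration.

```python
# Illustrative intercept and weights copied from the example formula.
INTERCEPT = 0.2
WEIGHTS = {"age": 5.0, "income": 0.0003}


def estimate_target(features):
    """Return the intercept plus the weighted sum of the feature values."""
    return INTERCEPT + sum(WEIGHTS[name] * value for name, value in features.items())


# Hypothetical customer: 30 years old with an income of 50,000.
customer = {"age": 30, "income": 50_000}
estimate = estimate_target(customer)
print(f"Estimated target: {estimate:.1f}")
```

In a real project the weights would be learned from training data rather than written by hand.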

Suppose the data consists of n observations {(x_i, y_i)}, i = 1, ..., n. Each data point is represented as (x_i, y_i). Linear models can be used for both regression and classification problems. Various types of linear models exist for various problems. …


Source: https://micoresolutions.com/top-database-management-challenges/

While a data scientist might have a variety of tools that they use for different purposes, most will agree that the Structured Query Language (SQL) is the single most important work tool a data scientist should have. For this reason, it is essential for a data scientist to be able to work with databases, as the bulk of the work requires accessing and querying real-life databases.

In this post, I will highlight the steps for working with databases from your Python terminal or script. We will use a popular Python library named sqlalchemy to connect our Python script to the database. First, however, I will go through the process of creating a database locally on your system. For starters, there are different database management systems: MySQL, PostgreSQL, MongoDB, etc. For this post, we will be creating a PostgreSQL database. …
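A rough sketch of the connection step, under stated assumptions: sqlalchemy identifies a database by a URL of the form `dialect+driver://user:password@host:port/database`. The helper names `build_postgres_url` and `make_engine`, the credentials, and the choice of the `psycopg2` driver are all hypothetical and not taken from the original post.

```python
from urllib.parse import quote


def build_postgres_url(user, password, host, port, database):
    """Build a PostgreSQL connection URL in the format sqlalchemy expects."""
    # Percent-encode the password so characters such as '@' or '/'
    # are not mistaken for URL delimiters.
    safe_password = quote(password, safe="")
    return f"postgresql+psycopg2://{user}:{safe_password}@{host}:{port}/{database}"


def make_engine(url):
    """Create a sqlalchemy engine; needs sqlalchemy and psycopg2 installed."""
    # Imported here so the URL helper above works without sqlalchemy.
    from sqlalchemy import create_engine

    return create_engine(url)


# Placeholder credentials for a local database (assumed values).
url = build_postgres_url("alice", "p@ss/word", "localhost", 5432, "demo")
print(url)
```

Calling `make_engine(url)` only prepares the engine; a connection to the server is opened when the engine is first used, for example with `engine.connect()`.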


Python data types that are used for collections

Photo by Ulises Baga on Unsplash

Just before we move on, if you need to go through the basics of data types and variables, kindly check the previous post here: https://medium.com/dsn-ai-futa/variables-and-data-types-c427233e684b

DICTIONARIES

A dictionary is an associative container that stores its items as key/value pairs. Unlike lists and strings, whose elements are accessed by position using indexing, the values in a dictionary are accessed using their keys. Each key is assigned to a value, and the key is what is used to store and retrieve that value. The choice between a sequence like a list and a mapping like a dictionary often depends on the specific situation. …
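A short sketch of the difference between key-based and position-based access. The `student` record and `scores` list are hypothetical values chosen only for illustration.

```python
# Hypothetical record used only for illustration.
student = {"name": "Ada", "age": 21}

# Values are looked up by key rather than by position.
name = student["name"]

# Assigning to a new key adds a key/value pair.
student["department"] = "Geology"

# .get() returns a default instead of raising KeyError for a missing key.
level = student.get("level", "unknown")

# A list, by contrast, is indexed by position.
scores = [70, 85, 90]
first_score = scores[0]

print(name, level, first_score)
```

A dictionary tends to fit when each value has a meaningful label, while a list fits when order is what matters.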



A variable is a container used to store and hold data. It can be seen as a name attached to a Python object, so that the data can be reused later in the program. This data can come in different types, which will be discussed in the sections below.

Variable assignment is the process of assigning a value or data to a variable; in other words, mapping the variable name to a value in the interpreter's memory. After assignment, the value is stored in the interpreter's memory and can be accessed by calling the variable name later in the program. …
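A minimal illustration of assignment and later access. The variable names and values here are made up for the example.

```python
# Assignment binds the name on the left to the value on the right.
greeting = "Hello"
count = 3

# Using the names later retrieves the stored values.
message = greeting * count

# Reassignment points the same name at a new value.
count = count + 1

print(message, count)
```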


Proper visualization enhances better hindsight

Understanding your data and the relationships present within it is just as important as any algorithm used to train your machine learning model. In fact, even the most sophisticated machine learning models will perform poorly on data that wasn’t visualized and understood properly. The reasons for this will be discussed throughout the rest of this article.

For beginners especially, knowing where to begin with data visualization can prove to be a challenge. This is because there is no single way to do Exploratory Data Analysis (EDA), as the EDA tools used for different projects aren’t always the same. In this lecture, we will go through the basic operations to carry out for data visualization depending on the data, and the different tools to use for each specific case. …


Problem Definition

‘You have been called in by the CEO of company ‘X’ to use your machine learning skills to study the pattern of promotion. With this insight, he can understand which of the available features are important for predicting promotion eligibility.’

Before proper data modelling can be done for better prediction, a good understanding of the data is needed. This can be achieved effectively via proper Exploratory Data Analysis (EDA). …


Atoms make up everything. Objects give them functions

What is Object Oriented Programming (OOP)?

Most developers, Python developers especially, will agree that early in their careers or learning stage, the concept of Object Oriented Programming seemed strange, both in its purpose and its usefulness. A common misconception among beginners is that OOP is a special tool or design, similar to creating functions, used to solve a particular problem. However, it is not. Before fully getting into what OOP is, let’s talk about the types of programming paradigms we have in Python. …

About

Ibrahim Olawale

Data Scientist. Machine Learning Engineer.
