
Data Cleaning In Python: The Ultimate Guide

  • Aelia Vita
  • December 2, 2021
  • 11 minute read

Techniques on what to clean and how.

Before fitting a machine learning or statistical model, we always have to clean the data. No models create meaningful results with messy data.


Data cleaning or cleansing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database and refers to identifying incomplete, incorrect, inaccurate or irrelevant parts of the data and then replacing, modifying, or deleting the dirty or coarse data.

What a long definition! It is certainly not fun and very time-consuming.


To make it easier, we created this new complete step-by-step guide in Python. You’ll learn techniques on how to find and clean:

  • Missing Data
  • Irregular Data (Outliers)
  • Unnecessary Data — Repetitive Data, Duplicates and more
  • Inconsistent Data — Capitalization, Addresses and more

Within this guide, we use the Russian housing dataset from Kaggle. The goal of this project is to predict housing price fluctuations in Russia. We are not cleaning the entire dataset but will show examples from it.

Before we jump into the cleaning process, let’s take a brief look at the data.

# import packages
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import matplotlib
plt.style.use('ggplot')
from matplotlib.pyplot import figure
%matplotlib inline
matplotlib.rcParams['figure.figsize'] = (12,8)
pd.options.mode.chained_assignment = None
# read the data
df = pd.read_csv('sberbank.csv')
# shape and data types of the data
print(df.shape)
print(df.dtypes)
# select numeric columns
df_numeric = df.select_dtypes(include=[np.number])
numeric_cols = df_numeric.columns.values
print(numeric_cols)
# select non numeric columns
df_non_numeric = df.select_dtypes(exclude=[np.number])
non_numeric_cols = df_non_numeric.columns.values
print(non_numeric_cols)

From these results, we learn that the dataset has 30,471 rows and 292 columns. We also identify which features are numeric and which are categorical. This is all useful information.

Now we can run through the checklist of “dirty” data types and fix them one by one.

Let’s get started.


Missing data

Dealing with missing values is one of the trickiest but most common parts of data cleaning. While many models can live with other problems in the data, most models do not accept missing data.

How to find out?

We cover three techniques to learn more about missing data in our dataset.

  • Technique #1: Missing Data Heatmap

When there is a smaller number of features, we can visualize the missing data via a heatmap.

cols = df.columns[:30] # first 30 columns
colours = ['#000099', '#ffff00'] # specify the colours: yellow is missing, blue is not missing.
sns.heatmap(df[cols].isnull(), cmap=sns.color_palette(colours))

The chart below shows the missing data patterns of the first 30 features. The horizontal axis shows the feature names; the vertical axis shows the observations/rows; yellow represents missing data and blue represents non-missing data.

For example, the life_sq feature has missing values throughout many rows, while the floor feature has only a few missing values around the 7000th row.

Missing Data Heatmap
  • Technique #2: Missing Data Percentage List

When there are many features in the dataset, we can make a list of missing data % for each feature.

# if it's a larger dataset and the visualization takes too long, we can do this instead.
# % of missing.
for col in df.columns:
    pct_missing = np.mean(df[col].isnull())
    print('{} - {}%'.format(col, round(pct_missing*100)))

This produces a list below showing the percentage of missing values for each of the features.

Specifically, we see that the life_sq feature has 21% missing, while floor has only 1% missing. This list is a useful summary that can complement the heatmap visualization.

Missing Data % List (first 30 features)
  • Technique #3: Missing Data Histogram

A missing data histogram is another technique for when we have many features.

To learn more about the missing value patterns among observations, we can visualize them with a histogram.

# first create missing indicator for features with missing data
for col in df.columns:
    missing = df[col].isnull()
    num_missing = np.sum(missing)
    if num_missing > 0:
        print('created missing indicator for: {}'.format(col))
        df['{}_ismissing'.format(col)] = missing
# then based on the indicator, plot the histogram of missing values
ismissing_cols = [col for col in df.columns if 'ismissing' in col]
df['num_missing'] = df[ismissing_cols].sum(axis=1)
df['num_missing'].value_counts().reset_index().sort_values(by='index').plot.bar(x='index', y='num_missing')

This histogram helps identify the missing value patterns among the 30,471 observations.

For example, there are over 6000 observations with no missing values and close to 4000 observations with one missing value.

Missing Data Histogram

What to do?

There is NO agreed-upon solution for dealing with missing data. We have to study the specific feature and dataset to decide the best way of handling it.

Below we cover the four most common methods of handling missing data. If the situation is more complicated than usual, we may need to get creative with more sophisticated methods such as missing data modeling (a brief sketch appears at the end of this section).

  • Solution #1: Drop the Observation

In statistics, this method is called listwise deletion. We drop an entire observation if it contains any missing value.

We should only do this when we are sure the missing data is not informative. Otherwise, we should consider other solutions.

We can also use other criteria for dropping observations.

For example, from the missing data histogram, we notice that only a minimal number of observations have more than 35 features missing. We may create a new dataset, df_less_missing_rows, that drops observations with more than 35 missing features.

# drop rows with a lot of missing values.
ind_missing = df[df['num_missing'] > 35].index
df_less_missing_rows = df.drop(ind_missing, axis=0)
  • Solution #2: Drop the Feature

Similar to Solution #1, we only do this when we are confident that this feature doesn’t provide useful information.

For example, from the missing data % list, we notice that hospital_beds_raion has a high missing value percentage of 47%. We may drop the entire feature.

# hospital_beds_raion has a lot of missing.
# If we want to drop.
cols_to_drop = ['hospital_beds_raion']
df_less_hos_beds_raion = df.drop(cols_to_drop, axis=1)
  • Solution #3: Impute the Missing

When the feature is a numeric variable, we can conduct missing data imputation. We replace the missing values with the mean or median of the non-missing values of the same feature.

When the feature is a categorical variable, we may impute the missing data by the mode (the most frequent value).

Using life_sq as an example, we can replace the missing values of this feature by its median.

# replace missing values with the median.
med = df['life_sq'].median()
print(med)
df['life_sq'] = df['life_sq'].fillna(med)

Moreover, we can apply the same imputation strategy for all the numeric features at once.

# impute the missing values and create the missing value indicator variables for each numeric column.
df_numeric = df.select_dtypes(include=[np.number])
numeric_cols = df_numeric.columns.values
for col in numeric_cols:
    missing = df[col].isnull()
    num_missing = np.sum(missing)
    if num_missing > 0:  # only do the imputation for the columns that have missing values.
        print('imputing missing values for: {}'.format(col))
        df['{}_ismissing'.format(col)] = missing
        med = df[col].median()
        df[col] = df[col].fillna(med)

Luckily, our dataset has no missing values in the categorical features. Still, we can apply the mode imputation strategy to all the categorical features at once.

# impute the missing values and create the missing value indicator variables for each non-numeric column.
df_non_numeric = df.select_dtypes(exclude=[np.number])
non_numeric_cols = df_non_numeric.columns.values
for col in non_numeric_cols:
    missing = df[col].isnull()
    num_missing = np.sum(missing)
    if num_missing > 0:  # only do the imputation for the columns that have missing values.
        print('imputing missing values for: {}'.format(col))
        df['{}_ismissing'.format(col)] = missing
        top = df[col].describe()['top']  # impute with the most frequent value.
        df[col] = df[col].fillna(top)
  • Solution #4: Replace the Missing

For categorical features, we can add a new category with a value such as "_MISSING_". For numeric features, we can replace the missing values with a particular value such as -999.

This way, we are still keeping the missing values as valuable information.

# categorical
df['sub_area'] = df['sub_area'].fillna('_MISSING_')
# numeric
df['life_sq'] = df['life_sq'].fillna(-999)
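
As promised earlier, here is a brief sketch of a more sophisticated, model-based imputation approach. It uses scikit-learn's KNNImputer on a few numeric columns; this is a minimal illustration that assumes scikit-learn is installed, and the column subset is chosen purely for demonstration.

# model-based imputation sketch: estimate each missing value from similar rows
# instead of a single column-wide statistic.
from sklearn.impute import KNNImputer

knn_cols = ['full_sq', 'life_sq', 'floor']  # small demo subset of numeric features
imputer = KNNImputer(n_neighbors=5)
df_knn = df.copy()
df_knn[knn_cols] = imputer.fit_transform(df_knn[knn_cols])
print(df_knn[knn_cols].isnull().sum())  # should now show zero missing values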

Irregular data (Outliers)

Outliers are data points that are distinctly different from the other observations. They could be genuine extreme values or mistakes.

How to find out?

Depending on whether the feature is numeric or categorical, we can use different techniques to study its distribution to detect outliers.

  • Technique #1: Histogram/Box Plot

When the feature is numeric, we can use a histogram and box plot to detect outliers.

Below is the histogram of feature life_sq.

# histogram of life_sq.
df['life_sq'].hist(bins=100)

The data looks highly skewed with the possible existence of outliers.

Histogram

To study the feature closer, let’s make a box plot.

# box plot.
df.boxplot(column=['life_sq'])

In this plot, we can see there is an outlier at a value of over 7000.

Box Plot
  • Technique #2: Descriptive Statistics

Also, for numeric features, the outliers may be so extreme that the box plot can't show them clearly. Instead, we can look at their descriptive statistics.

For example, for the feature life_sq again, we can see that the maximum value is 7478, while the 75th percentile is only 43. The value 7478 is an outlier.

df['life_sq'].describe()
  • Technique #3: Bar Chart

When the feature is categorical, we can use a bar chart to learn about its categories and distribution.

For example, the feature ecology has a reasonable distribution. But if there were a category, such as "other", with only a single observation, that could be considered an outlier.

# bar chart -  distribution of a categorical variable
df['ecology'].value_counts().plot.bar()
Bar Chart
  • Other Techniques: Many other techniques can spot outliers as well, such as scatter plots, z-scores, and clustering. This article does not cover all of them, but a quick z-score sketch follows below.
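
As a minimal illustration (not part of the original walkthrough), the z-score check below flags life_sq values that sit more than three standard deviations from the mean; the threshold of 3 is a common rule of thumb, not a value derived from this dataset.

# z-score sketch: flag life_sq values far from the mean.
life_sq = df['life_sq'].dropna()
z_scores = (life_sq - life_sq.mean()) / life_sq.std()
outliers = life_sq[z_scores.abs() > 3]
print(len(outliers))
print(outliers.sort_values(ascending=False).head())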

What to do?

While outliers are not hard to detect, we have to determine the right way to handle them. That choice depends heavily on the dataset and the goal of the project.

The methods for handling outliers are somewhat similar to those for missing data: we either drop, adjust, or keep them. We can refer back to the missing data section for possible solutions, and one example of adjusting is sketched below.
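
As one hedged example of adjusting outliers, the sketch below caps life_sq at its 1st and 99th percentiles (a simple winsorizing step); the percentile cutoffs are illustrative choices, not values from the original analysis.

# cap extreme life_sq values at the 1st and 99th percentiles.
lower = df['life_sq'].quantile(0.01)
upper = df['life_sq'].quantile(0.99)
df['life_sq_capped'] = df['life_sq'].clip(lower=lower, upper=upper)
print(df[['life_sq', 'life_sq_capped']].describe())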


Unnecessary data

After all the hard work done for missing data and outliers, let’s look at unnecessary data, which is more straightforward.

All the data fed into the model should serve the purpose of the project. Data is unnecessary when it doesn't add value. We cover three main types of unnecessary data, which arise for different reasons.

Unnecessary type #1: Uninformative / Repetitive

Sometimes a feature is uninformative because too many of its rows share the same value.

How to find out?

We can create a list of features with a high percentage of the same value.

For example, the code below lists features where over 95% of the rows hold the same value.

num_rows = len(df.index)
low_information_cols = []
for col in df.columns:
    cnts = df[col].value_counts(dropna=False)
    top_pct = (cnts / num_rows).iloc[0]
    if top_pct > 0.95:
        low_information_cols.append(col)
        print('{0}: {1:.5f}%'.format(col, top_pct * 100))
        print(cnts)
        print()

We can look into these variables one by one to see whether they are informative or not. We won’t show the details here.


What to do?

We need to understand the reasons behind a repetitive feature. When it is genuinely uninformative, we can toss it out, as sketched below.
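
If, after reviewing them, we decide these columns really are uninformative, a minimal way to drop them is shown below. Whether to drop all of low_information_cols is a judgment call for the project; this drops the whole list only for illustration.

# drop the repetitive columns judged to be uninformative.
df_reduced = df.drop(columns=low_information_cols)
print(df.shape)
print(df_reduced.shape)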

Unnecessary type #2: Irrelevant

Again, the data needs to provide valuable information for the project. If the features are not related to the question we are trying to solve in the project, they are irrelevant.

How to find out?

We need to skim through the features to identify irrelevant ones.

For example, a feature recording the temperature in Toronto doesn’t provide any useful insights to predict Russian housing prices.

What to do?

When the features are not serving the project’s goal, we can remove them.


Unnecessary type #3: Duplicates

Duplicate data occurs when copies of the same observation exist.

There are two main types of duplicate data.

  • Duplicates type #1: All Features based

How to find out?

This type of duplicate occurs when all the feature values of two or more observations are identical. It is easy to find.

We first remove the unique identifier id in the dataset. Then we create a dataset called df_dedupped by dropping the duplicates. We compare the shapes of the two datasets (df and df_dedupped) to find out the number of duplicated rows.

# we know that column 'id' is unique, but what if we drop it?
df_dedupped = df.drop('id', axis=1).drop_duplicates()
# there were duplicate rows
print(df.shape)
print(df_dedupped.shape)

10 rows turn out to be complete duplicates of other observations.

What to do?

We should remove these duplicates, which we already did.

  • Duplicates type #2: Key Features based

How to find out?

Sometimes it is better to remove duplicate data based on a set of unique identifiers.

For example, the chances of two transactions happening at the same time, with the same square footage, the same price, and the same build year are close to zero.

We can set up a group of critical features as unique identifiers for transactions. We include timestamp, full_sq, life_sq, floor, build_year, num_room, price_doc. We check if there are duplicates based on them.

key = ['timestamp', 'full_sq', 'life_sq', 'floor', 'build_year', 'num_room', 'price_doc']
df.fillna(-999).groupby(key)['id'].count().sort_values(ascending=False).head(20)

There are 16 duplicates based on this set of key features.


What to do?

We can drop these duplicates based on the key features.

# drop duplicates based on a subset of variables.
key = ['timestamp', 'full_sq', 'life_sq', 'floor', 'build_year', 'num_room', 'price_doc']
df_dedupped2 = df.drop_duplicates(subset=key)
print(df.shape)
print(df_dedupped2.shape)

We dropped the 16 duplicates within the new dataset named df_dedupped2.

Inconsistent data

It is also crucial for the dataset to follow consistent standards before fitting a model. We need to explore the data in different ways to find inconsistencies. Much of the time, this depends on observation and experience; there is no single set of code to run that fixes them all.

Below we cover four inconsistent data types.

Inconsistent type #1: Capitalization

Inconsistent use of upper and lower case in categorical values is a common mistake. It can cause issues because string comparisons in Python are case sensitive.


How to find out?

Let’s look at the sub_area feature.

df['sub_area'].value_counts(dropna=False)

It stores the name of different areas and looks very standardized.

But sometimes capitalization is used inconsistently within the same feature. For example, "Poselenie Sosenskoe" and "pOseleNie sosenskeo" could refer to the same area.

What to do?

To avoid this, we can convert all letters to lower case (or upper case).

# make everything lower case.
df['sub_area_lower'] = df['sub_area'].str.lower()
df['sub_area_lower'].value_counts(dropna=False)

Inconsistent type #2: Formats

Another standardization we need to perform is on data formats. One example is converting a feature from string to datetime format.

How to find out?

The feature timestamp is in string format, even though it represents dates.

df

What to do?

We can convert it and extract the date or time values using the code below. After this, it's easier to analyze transaction volume grouped by year or month.

df['timestamp_dt'] = pd.to_datetime(df['timestamp'], format='%Y-%m-%d')
df['year'] = df['timestamp_dt'].dt.year
df['month'] = df['timestamp_dt'].dt.month
df['weekday'] = df['timestamp_dt'].dt.weekday
print(df['year'].value_counts(dropna=False))
print()
print(df['month'].value_counts(dropna=False))

Related article: How To Manipulate Date And Time In Python Like A Boss

Inconsistent type #3: Categorical Values

Inconsistent categorical values are the third inconsistent type we cover. A categorical feature takes a limited number of values. Sometimes other values appear for reasons such as typos.

How to find out?

We need to observe the feature to find out this inconsistency. Let’s show this with an example.

We create a new example dataset below, since the real estate dataset doesn't have this problem. For instance, the value of city was mistyped as "torontoo" and "tronto", but both refer to the correct value "toronto".

A simple way to identify them is fuzzy matching, based on edit distance. It measures how many letters (the distance) we need to change in one value's spelling to match another value.

We know the category should only have four values: "toronto", "vancouver", "montreal", and "calgary". We calculate the distance between every value and the words "toronto" and "vancouver". The values that are likely typos have a smaller distance to the correct word, since they differ by only a couple of letters.

from nltk.metrics import edit_distance
df_city_ex = pd.DataFrame(data={'city': ['torontoo', 'toronto', 'tronto', 'vancouver', 'vancover', 'vancouvr', 'montreal', 'calgary']})
df_city_ex['city_distance_toronto'] = df_city_ex['city'].map(lambda x: edit_distance(x, 'toronto'))
df_city_ex['city_distance_vancouver'] = df_city_ex['city'].map(lambda x: edit_distance(x, 'vancouver'))
df_city_ex

What to do?

We can set criteria to convert these typos to the correct values. For example, the code below sets every value within an edit distance of 2 from "toronto" to "toronto" (and similarly for "vancouver").

msk = df_city_ex['city_distance_toronto'] <= 2
df_city_ex.loc[msk, 'city'] = 'toronto'
msk = df_city_ex['city_distance_vancouver'] <= 2
df_city_ex.loc[msk, 'city'] = 'vancouver'
df_city_ex

Inconsistent type #4: Addresses

The address feature can be a headache for many of us, because people entering data into the database often don't follow a standard format.

How to find out?

We can find messy address data by eyeballing it. Even if we can't spot any obvious issues, we can still run code to standardize it.

There is no address column in our dataset for privacy reasons. So we create a new dataset df_add_ex with feature address.

# no address column in the housing dataset. So create one to show the code.
df_add_ex = pd.DataFrame(['123 MAIN St Apartment 15', '123 Main Street Apt 12 ', '543 FirSt Av', ' 876 FIRst Ave.'], columns=['address'])
df_add_ex

As we can see, the address feature is quite messy.


What to do?

We run the code below to lowercase the letters, remove leading and trailing whitespace, delete periods, and standardize wording.

df_add_ex['address_std'] = df_add_ex['address'].str.lower()
df_add_ex['address_std'] = df_add_ex['address_std'].str.strip()  # remove leading and trailing whitespace.
df_add_ex['address_std'] = df_add_ex['address_std'].str.replace('\\.', '', regex=True)  # remove periods.
df_add_ex['address_std'] = df_add_ex['address_std'].str.replace('\\bstreet\\b', 'st', regex=True)  # replace street with st.
df_add_ex['address_std'] = df_add_ex['address_std'].str.replace('\\bapartment\\b', 'apt', regex=True)  # replace apartment with apt.
df_add_ex['address_std'] = df_add_ex['address_std'].str.replace('\\bav\\b', 'ave', regex=True)  # replace av with ave.
df_add_ex

It looks much better now.


We did it! What a long journey it has been.

Clear out all the "dirty" data blocking your way to fitting the model.

Be the boss of cleaning!

This feature was originally sourced from JustintoData.
