Assignment 4: Creating Reports and Dashboards for Predictive and Prescriptive Analysis¶

Data Exploration and Preparation¶

Part 1: Predictive Analysis with Housing Prices Dataset¶

Preparation¶

Mario Zamudio (NF1002499)¶

University of Niagara Falls Canada

Master of Data Analytics

Data Warehousing/Visualization (CPSC-510-7)

Winter 2025

Ahmed Eltahawi¶

March 25, 2025

Loading Libraries to use¶

In this step, we import all the libraries to use in the process.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Loading Dataset and Basic information¶

In this section, the data is loaded and some preprocessing is done after to review the content of the columns in the next steps.

  • Display Configuration: Adjusts the settings to ensure that all columns of the dataset are visible when printed, which is especially helpful for wide datasets.
  • Plot Style Setup: Sets a clean and readable visual style for all plots using the Seaborn library.
  • Data Loading: Reads a CSV file named Housing Prices Dataset.csv and loads it into a DataFrame for analysis.
  • Initial Data Cleaning: Removes any rows that contain missing (NA) values to ensure the initial analysis is performed on complete data.
  • Shape Inspection: Displays the number of rows and columns in the dataset to understand its size and structure.
  • Column Removal: Drops the first two columns from the dataset, likely because they contain identifiers or metadata that are not useful for analysis.
  • Data Type Enforcement: Converts the SalePrice column to a 64-bit integer to ensure consistent numerical formatting and compatibility with later analysis or modeling steps.
  • Data Summary: Prints structural information about the dataset, including column names, data types, and non-null counts, helping to confirm data integrity.
  • Preview: Shows the first five rows of the dataset to get an overview of the data’s contents and layout.
  • Descriptive Statistics: Displays summary statistics (like mean, median, min, max, etc.) for both numerical and categorical variables, rounded to one decimal for better readability.
In [2]:
# Display settings
# Ensures that all columns in the DataFrame will be displayed when printed, 
# which is useful for datasets with many columns.
pd.set_option('display.max_columns', None)

# Sets the visual style for seaborn plots to 'whitegrid' for better readability.
sns.set(style="whitegrid")

# Load the dataset
# The dataset is loaded from a CSV file named "Housing Prices Dataset.csv" 
# into a Pandas DataFrame called 'df'.
df = pd.read_csv("Housing Prices Dataset.csv")

# 1. Initial EDA (Exploratory Data Analysis)

# Remove rows with any missing values (NA) to ensure clean initial analysis.
# This helps avoid issues in summary statistics or visualizations due to incomplete data.
df = df.dropna()

# Print the shape of the dataset (rows, columns) to understand its size.
print("\n--- Shape of Dataset ---")
print(df.shape)

# Remove the first two columns from the DataFrame.
# These might be identifiers or metadata not needed for analysis.
df = df.iloc[:, 2:]

# Force the 'SalePrice' column to be of type int64 (integer).
# This ensures consistent numeric formatting and compatibility with modeling tasks.
df['SalePrice'] = df['SalePrice'].astype('int64')

# Display basic information about the dataset.
# This includes column names, non-null counts, and data types.
# It's useful for checking the integrity and structure of the data.
print("\n--- Dataset Information ---")
df.info()

# Display the first five rows of the dataset.
# Helps in gaining a quick look at the structure and sample values.
print("\n--- First five rows ---")
print(df.head())

# Summary statistics
# Provides descriptive statistics (count, mean, std, min, max, etc.)
# for both numerical and categorical columns.
# Rounded to 1 decimal for better readability.
print("\n--- Summary Statistics ---")
print(df.describe(include='all').round(1))
--- Shape of Dataset ---
(1460, 52)

--- Dataset Information ---
<class 'pandas.core.frame.DataFrame'>
Index: 1460 entries, 0 to 2889
Data columns (total 50 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   LotFrontage    1460 non-null   int64
 1   LotArea        1460 non-null   int64
 2   OverallQual    1460 non-null   int64
 3   OverallCond    1460 non-null   int64
 4   YearBuilt      1460 non-null   int64
 5   YearRemodAdd   1460 non-null   int64
 6   MasVnrArea     1460 non-null   int64
 7   ExterQual      1460 non-null   int64
 8   ExterCond      1460 non-null   int64
 9   BsmtQual       1460 non-null   int64
 10  BsmtCond       1460 non-null   int64
 11  BsmtExposure   1460 non-null   int64
 12  BsmtFinType1   1460 non-null   int64
 13  BsmtFinSF1     1460 non-null   int64
 14  BsmtFinType2   1460 non-null   int64
 15  BsmtFinSF2     1460 non-null   int64
 16  BsmtUnfSF      1460 non-null   int64
 17  TotalBsmtSF    1460 non-null   int64
 18  HeatingQC      1460 non-null   int64
 19  1stFlrSF       1460 non-null   int64
 20  2ndFlrSF       1460 non-null   int64
 21  LowQualFinSF   1460 non-null   int64
 22  GrLivArea      1460 non-null   int64
 23  BsmtFullBath   1460 non-null   int64
 24  BsmtHalfBath   1460 non-null   int64
 25  FullBath       1460 non-null   int64
 26  HalfBath       1460 non-null   int64
 27  BedroomAbvGr   1460 non-null   int64
 28  KitchenAbvGr   1460 non-null   int64
 29  KitchenQual    1460 non-null   int64
 30  TotRmsAbvGrd   1460 non-null   int64
 31  Fireplaces     1460 non-null   int64
 32  FireplaceQu    1460 non-null   int64
 33  GarageYrBlt    1460 non-null   int64
 34  GarageFinish   1460 non-null   int64
 35  GarageCars     1460 non-null   int64
 36  GarageArea     1460 non-null   int64
 37  GarageQual     1460 non-null   int64
 38  GarageCond     1460 non-null   int64
 39  PavedDrive     1460 non-null   int64
 40  WoodDeckSF     1460 non-null   int64
 41  OpenPorchSF    1460 non-null   int64
 42  EnclosedPorch  1460 non-null   int64
 43  3SsnPorch      1460 non-null   int64
 44  ScreenPorch    1460 non-null   int64
 45  PoolArea       1460 non-null   int64
 46  Fence          1460 non-null   int64
 47  MiscVal        1460 non-null   int64
 48  YrSold         1460 non-null   int64
 49  SalePrice      1460 non-null   int64
dtypes: int64(50)
memory usage: 581.7 KB

--- First five rows ---
   LotFrontage  LotArea  OverallQual  OverallCond  YearBuilt  YearRemodAdd  \
0           21     1596            4            5       1973          1973   
1           21     1890            4            7       1972          1972   
2           21     1526            4            8       1970          2002   
3           21     1491            4            6       1972          1972   
4           21     1936            4            6       1970          1970   

   MasVnrArea  ExterQual  ExterCond  BsmtQual  BsmtCond  BsmtExposure  \
0           0          2          2         4         3             4   
1           0          2          3         4         3             3   
2           0          2          3         4         3             3   
3           0          2          2         4         3             3   
4           0          2          2         4         3             3   

   BsmtFinType1  BsmtFinSF1  BsmtFinType2  BsmtFinSF2  BsmtUnfSF  TotalBsmtSF  \
0             6         462             1           0          0          462   
1             5         495             1           0        135          630   
2             6         515             1           0        115          630   
3             2         150             6         480          0          630   
4             4         131             6         499          0          630   

   HeatingQC  1stFlrSF  2ndFlrSF  LowQualFinSF  GrLivArea  BsmtFullBath  \
0          2       526       462             0        988             1   
1          3       630         0             0        630             1   
2          2       630         0             0        630             1   
3          4       630         0             0        630             1   
4          3       630         0             0        630             1   

   BsmtHalfBath  FullBath  HalfBath  BedroomAbvGr  KitchenAbvGr  KitchenQual  \
0             0         1         0             2             1            2   
1             0         1         0             1             1            2   
2             0         1         0             1             1            3   
3             0         1         0             1             1            2   
4             0         1         0             1             1            2   

   TotRmsAbvGrd  Fireplaces  FireplaceQu  GarageYrBlt  GarageFinish  \
0             5           0            0         1973             1   
1             3           0            0            0             0   
2             3           0            0         1970             1   
3             3           0            0            0             0   
4             3           0            0            0             0   

   GarageCars  GarageArea  GarageQual  GarageCond  PavedDrive  WoodDeckSF  \
0           1         297           3           3           3         120   
1           0           0           0           0           3          88   
2           1         286           3           3           3           0   
3           0           0           0           0           3          96   
4           0           0           0           0           3           0   

   OpenPorchSF  EnclosedPorch  3SsnPorch  ScreenPorch  PoolArea  Fence  \
0          101              0          0            0         0      2   
1            0              0          0            0         0      0   
2            0              0          0            0         0      0   
3           24              0          0            0         0      0   
4            0              0          0            0         0      3   

   MiscVal  YrSold  SalePrice  
0        0    2009      91000  
1        0    2008      81000  
2        0    2009      86000  
3        0    2010      75500  
4        0    2007      84500  

--- Summary Statistics ---
       LotFrontage   LotArea  OverallQual  OverallCond  YearBuilt  \
count       1460.0    1460.0       1460.0       1460.0     1460.0   
mean          57.6   10516.8          6.1          5.6     1971.3   
std           34.7    9981.3          1.4          1.1       30.2   
min            0.0    1300.0          1.0          1.0     1872.0   
25%           42.0    7553.5          5.0          5.0     1954.0   
50%           63.0    9478.5          6.0          5.0     1973.0   
75%           79.0   11601.5          7.0          6.0     2000.0   
max          313.0  215245.0         10.0          9.0     2010.0   

       YearRemodAdd  MasVnrArea  ExterQual  ExterCond  BsmtQual  BsmtCond  \
count        1460.0      1460.0     1460.0     1460.0    1460.0    1460.0   
mean         1978.5       103.1        2.4        2.1       3.5       2.9   
std            25.4       180.7        0.6        0.4       0.9       0.6   
min          1900.0         0.0        1.0        0.0       0.0       0.0   
25%          1959.0         0.0        2.0        2.0       3.0       3.0   
50%          1980.0         0.0        2.0        2.0       4.0       3.0   
75%          2003.0       164.2        3.0        2.0       4.0       3.0   
max          2010.0      1600.0        4.0        4.0       5.0       4.0   

       BsmtExposure  BsmtFinType1  BsmtFinSF1  BsmtFinType2  BsmtFinSF2  \
count        1460.0        1460.0      1460.0        1460.0      1460.0   
mean            1.6           3.5       443.6           1.2        46.5   
std             1.1           2.1       456.1           0.9       161.3   
min             0.0           0.0         0.0           0.0         0.0   
25%             1.0           1.0         0.0           1.0         0.0   
50%             1.0           4.0       383.5           1.0         0.0   
75%             2.0           6.0       712.2           1.0         0.0   
max             4.0           6.0      5644.0           6.0      1474.0   

       BsmtUnfSF  TotalBsmtSF  HeatingQC  1stFlrSF  2ndFlrSF  LowQualFinSF  \
count     1460.0       1460.0     1460.0    1460.0    1460.0        1460.0   
mean       567.2       1057.4        3.1    1162.6     347.0           5.8   
std        441.9        438.7        1.0     386.6     436.5          48.6   
min          0.0          0.0        0.0     334.0       0.0           0.0   
25%        223.0        795.8        2.0     882.0       0.0           0.0   
50%        477.5        991.5        4.0    1087.0       0.0           0.0   
75%        808.0       1298.2        4.0    1391.2     728.0           0.0   
max       2336.0       6110.0        4.0    4692.0    2065.0         572.0   

       GrLivArea  BsmtFullBath  BsmtHalfBath  FullBath  HalfBath  \
count     1460.0        1460.0        1460.0    1460.0    1460.0   
mean      1515.5           0.4           0.1       1.6       0.4   
std        525.5           0.5           0.2       0.6       0.5   
min        334.0           0.0           0.0       0.0       0.0   
25%       1129.5           0.0           0.0       1.0       0.0   
50%       1464.0           0.0           0.0       2.0       0.0   
75%       1776.8           1.0           0.0       2.0       1.0   
max       5642.0           3.0           2.0       3.0       2.0   

       BedroomAbvGr  KitchenAbvGr  KitchenQual  TotRmsAbvGrd  Fireplaces  \
count        1460.0        1460.0       1460.0        1460.0      1460.0   
mean            2.9           1.0          2.5           6.5         0.6   
std             0.8           0.2          0.7           1.6         0.6   
min             0.0           0.0          1.0           2.0         0.0   
25%             2.0           1.0          2.0           5.0         0.0   
50%             3.0           1.0          2.0           6.0         1.0   
75%             3.0           1.0          3.0           7.0         1.0   
max             8.0           3.0          4.0          14.0         3.0   

       FireplaceQu  GarageYrBlt  GarageFinish  GarageCars  GarageArea  \
count       1460.0       1460.0        1460.0      1460.0      1460.0   
mean           1.8       1868.7           1.7         1.8       473.0   
std            1.8        453.7           0.9         0.7       213.8   
min            0.0          0.0           0.0         0.0         0.0   
25%            0.0       1958.0           1.0         1.0       334.5   
50%            2.0       1977.0           2.0         2.0       480.0   
75%            4.0       2001.0           2.0         2.0       576.0   
max            5.0       2010.0           3.0         4.0      1418.0   

       GarageQual  GarageCond  PavedDrive  WoodDeckSF  OpenPorchSF  \
count      1460.0      1460.0      1460.0      1460.0       1460.0   
mean          2.8         2.8         2.9        94.2         46.7   
std           0.7         0.7         0.5       125.3         66.3   
min           0.0         0.0         1.0         0.0          0.0   
25%           3.0         3.0         3.0         0.0          0.0   
50%           3.0         3.0         3.0         0.0         25.0   
75%           3.0         3.0         3.0       168.0         68.0   
max           5.0         5.0         3.0       857.0        547.0   

       EnclosedPorch  3SsnPorch  ScreenPorch  PoolArea   Fence  MiscVal  \
count         1460.0     1460.0       1460.0    1460.0  1460.0   1460.0   
mean            22.0        3.4         15.1       2.8     0.6     43.5   
std             61.1       29.3         55.8      40.2     1.2    496.1   
min              0.0        0.0          0.0       0.0     0.0      0.0   
25%              0.0        0.0          0.0       0.0     0.0      0.0   
50%              0.0        0.0          0.0       0.0     0.0      0.0   
75%              0.0        0.0          0.0       0.0     0.0      0.0   
max            552.0      508.0        480.0     738.0     4.0  15500.0   

       YrSold  SalePrice  
count  1460.0     1460.0  
mean   2007.8   180921.2  
std       1.3    79442.5  
min    2006.0    34900.0  
25%    2007.0   129975.0  
50%    2008.0   163000.0  
75%    2009.0   214000.0  
max    2010.0   755000.0  

Checking Missing Values¶

Identifies if there are any missing values in the dataset, which is crucial for data cleaning, after the first check of the data.

In [3]:
# Check for missing values
# Identifies if there are any missing values in the dataset, which is crucial for data cleaning.
print("\nMissing Values:")
print(df.isnull().sum())
Missing Values:
LotFrontage      0
LotArea          0
OverallQual      0
OverallCond      0
YearBuilt        0
YearRemodAdd     0
MasVnrArea       0
ExterQual        0
ExterCond        0
BsmtQual         0
BsmtCond         0
BsmtExposure     0
BsmtFinType1     0
BsmtFinSF1       0
BsmtFinType2     0
BsmtFinSF2       0
BsmtUnfSF        0
TotalBsmtSF      0
HeatingQC        0
1stFlrSF         0
2ndFlrSF         0
LowQualFinSF     0
GrLivArea        0
BsmtFullBath     0
BsmtHalfBath     0
FullBath         0
HalfBath         0
BedroomAbvGr     0
KitchenAbvGr     0
KitchenQual      0
TotRmsAbvGrd     0
Fireplaces       0
FireplaceQu      0
GarageYrBlt      0
GarageFinish     0
GarageCars       0
GarageArea       0
GarageQual       0
GarageCond       0
PavedDrive       0
WoodDeckSF       0
OpenPorchSF      0
EnclosedPorch    0
3SsnPorch        0
ScreenPorch      0
PoolArea         0
Fence            0
MiscVal          0
YrSold           0
SalePrice        0
dtype: int64

Reviewing Distribution of Numerical Variables¶

In this piece of code, we check the basic information to detect trends, outliers and correlations to be used in the next steps of the process.

  • Identifying Numeric Columns: Selects all columns in the dataset that contain numerical data types to prepare them for visualization.
  • Creating Histograms: Generates histograms for each numeric feature to show the distribution of their values (e.g., how frequently values occur within certain ranges). This helps in detecting skewness, peaks, or gaps in the data.
  • Adjusting Layout and Titles: Sets a title for the collection of histograms, arranges the plots to fit nicely within the figure, and displays them.
  • Generating Correlation Heatmap: Creates a heatmap that visualizes the correlation matrix of the dataset, showing the strength and direction of relationships between numeric features.
  • Styling the Heatmap: Applies a color scheme to indicate correlation strength and removes internal annotations to keep the visual clean.
  • Displaying the Heatmap: Presents the finalized heatmap chart to help identify patterns or multicollinearity among features.
In [4]:
# Visualizing distributions of numerical features
numeric_cols = df.select_dtypes(include=[np.number]).columns.tolist()
df[numeric_cols].hist(bins=20, figsize=(15, 10))
plt.suptitle('Distributions of Numeric Features')
plt.tight_layout()
plt.show()

# Correlation heatmap
plt.figure(figsize=(15, 15))
sns.heatmap(df.corr(), annot=False, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()
No description has been provided for this image
No description has been provided for this image

Data Cleaning¶

The following steps involve coding for data cleaning and exporting it to a file for future use.

Handling Missing Values:¶

  • Identifies columns that still contain missing values.
  • For categorical columns, it fills missing values with the most frequent category (mode).
  • For numerical columns, it fills missing values with the median.
  • This ensures the dataset is complete before further analysis.

Boxplots Before Outlier Handling:¶

  • Creates boxplots for all numeric features to visually inspect their value distributions and detect outliers.
  • Organizes plots in a grid layout for clarity.
  • Helps understand which variables have extreme values that may need correction.

Outlier Detection and Treatment (IQR Method):¶

  • For each numeric column, calculates the interquartile range (IQR).
  • Defines lower and upper bounds to identify outliers.
  • Clips values outside these bounds to the nearest acceptable values (winsorization).
  • Reduces the influence of extreme values on the analysis or modeling.

Ensuring Integer Data Types:¶

  • After clipping, it checks if the original data type of a column was integer.
  • If so, it rounds the values and converts them back to integer format to maintain data consistency.

Boxplots After Outlier Handling:¶

  • Recreates the boxplots to show the new distribution of numeric features after outliers have been handled.
  • Useful for visual comparison with the earlier boxplots.

Dataset Summary After Cleaning:¶

Displays structure and data types of the final cleaned dataset.

Saving the Cleaned Dataset:¶

Writes the cleaned DataFrame to a CSV file for future use or modeling.

In [5]:
# Handle missing values (replace with zero if NA still exists after imputation)
missing_cols = df.columns[df.isnull().any()].tolist()
for col in missing_cols:
    if df[col].dtype == 'object':
        df[col].fillna(df[col].mode()[0], inplace=True)
    else:
        df[col].fillna(df[col].median(), inplace=True)

# Boxplot diagrams before handling outliers
print("\n--- Boxplots Before Handling Outliers ---")
num_plots = len(numeric_cols)
cols = 4
rows = (num_plots + cols - 1) // cols
plt.figure(figsize=(5 * cols, 4 * rows))
for i, col in enumerate(numeric_cols):
    plt.subplot(rows, cols, i + 1)
    sns.boxplot(y=df[col])
    plt.title(col, fontsize=10)
    plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

# Handling outliers (optional: use IQR method for numeric features)
for col in numeric_cols:
    Q1 = df[col].quantile(0.25)
    Q3 = df[col].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    outliers = df[(df[col] < lower_bound) | (df[col] > upper_bound)]
    if not outliers.empty:
        print(f"{col}: {len(outliers)} outliers detected")
        # Optional: clip or remove outliers
        df[col] = df[col].clip(lower_bound, upper_bound)

# Ensure all numeric columns are integers if they originally were
for col in numeric_cols:
    if pd.api.types.is_integer_dtype(df[col].dropna().astype(int)):
        df[col] = df[col].round().astype(int)

# Boxplot diagrams after handling outliers
print("\n--- Boxplots After Handling Outliers ---")
plt.figure(figsize=(5 * cols, 4 * rows))
for i, col in enumerate(numeric_cols):
    plt.subplot(rows, cols, i + 1)
    sns.boxplot(y=df[col])
    plt.title(col, fontsize=10)
    plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

print("\n--- Dataset After Cleaning ---")
print(df.info())

# Save cleaned dataset
cleaned_file_path = 'Housing_Prices_Cleaned.csv'
df.to_csv(cleaned_file_path, index=False)
print(f"\nCleaned dataset saved to: {cleaned_file_path}")
--- Boxplots Before Handling Outliers ---
No description has been provided for this image
LotFrontage: 16 outliers detected
LotArea: 69 outliers detected
OverallQual: 2 outliers detected
OverallCond: 125 outliers detected
YearBuilt: 7 outliers detected
MasVnrArea: 98 outliers detected
ExterCond: 178 outliers detected
BsmtQual: 37 outliers detected
BsmtCond: 149 outliers detected
BsmtExposure: 134 outliers detected
BsmtFinSF1: 7 outliers detected
BsmtFinType2: 203 outliers detected
BsmtFinSF2: 167 outliers detected
BsmtUnfSF: 29 outliers detected
TotalBsmtSF: 61 outliers detected
1stFlrSF: 20 outliers detected
2ndFlrSF: 2 outliers detected
LowQualFinSF: 26 outliers detected
GrLivArea: 31 outliers detected
BsmtFullBath: 1 outliers detected
BsmtHalfBath: 82 outliers detected
BedroomAbvGr: 35 outliers detected
KitchenAbvGr: 68 outliers detected
TotRmsAbvGrd: 30 outliers detected
Fireplaces: 5 outliers detected
GarageYrBlt: 81 outliers detected
GarageCars: 5 outliers detected
GarageArea: 21 outliers detected
GarageQual: 149 outliers detected
GarageCond: 134 outliers detected
PavedDrive: 120 outliers detected
WoodDeckSF: 32 outliers detected
OpenPorchSF: 77 outliers detected
EnclosedPorch: 208 outliers detected
3SsnPorch: 24 outliers detected
ScreenPorch: 116 outliers detected
PoolArea: 7 outliers detected
Fence: 281 outliers detected
MiscVal: 52 outliers detected
SalePrice: 61 outliers detected

--- Boxplots After Handling Outliers ---
No description has been provided for this image
--- Dataset After Cleaning ---
<class 'pandas.core.frame.DataFrame'>
Index: 1460 entries, 0 to 2889
Data columns (total 50 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   LotFrontage    1460 non-null   int32
 1   LotArea        1460 non-null   int32
 2   OverallQual    1460 non-null   int32
 3   OverallCond    1460 non-null   int32
 4   YearBuilt      1460 non-null   int32
 5   YearRemodAdd   1460 non-null   int32
 6   MasVnrArea     1460 non-null   int32
 7   ExterQual      1460 non-null   int32
 8   ExterCond      1460 non-null   int32
 9   BsmtQual       1460 non-null   int32
 10  BsmtCond       1460 non-null   int32
 11  BsmtExposure   1460 non-null   int32
 12  BsmtFinType1   1460 non-null   int32
 13  BsmtFinSF1     1460 non-null   int32
 14  BsmtFinType2   1460 non-null   int32
 15  BsmtFinSF2     1460 non-null   int32
 16  BsmtUnfSF      1460 non-null   int32
 17  TotalBsmtSF    1460 non-null   int32
 18  HeatingQC      1460 non-null   int32
 19  1stFlrSF       1460 non-null   int32
 20  2ndFlrSF       1460 non-null   int32
 21  LowQualFinSF   1460 non-null   int32
 22  GrLivArea      1460 non-null   int32
 23  BsmtFullBath   1460 non-null   int32
 24  BsmtHalfBath   1460 non-null   int32
 25  FullBath       1460 non-null   int32
 26  HalfBath       1460 non-null   int32
 27  BedroomAbvGr   1460 non-null   int32
 28  KitchenAbvGr   1460 non-null   int32
 29  KitchenQual    1460 non-null   int32
 30  TotRmsAbvGrd   1460 non-null   int32
 31  Fireplaces     1460 non-null   int32
 32  FireplaceQu    1460 non-null   int32
 33  GarageYrBlt    1460 non-null   int32
 34  GarageFinish   1460 non-null   int32
 35  GarageCars     1460 non-null   int32
 36  GarageArea     1460 non-null   int32
 37  GarageQual     1460 non-null   int32
 38  GarageCond     1460 non-null   int32
 39  PavedDrive     1460 non-null   int32
 40  WoodDeckSF     1460 non-null   int32
 41  OpenPorchSF    1460 non-null   int32
 42  EnclosedPorch  1460 non-null   int32
 43  3SsnPorch      1460 non-null   int32
 44  ScreenPorch    1460 non-null   int32
 45  PoolArea       1460 non-null   int32
 46  Fence          1460 non-null   int32
 47  MiscVal        1460 non-null   int32
 48  YrSold         1460 non-null   int32
 49  SalePrice      1460 non-null   int32
dtypes: int32(50)
memory usage: 296.6 KB
None

Cleaned dataset saved to: Housing_Prices_Cleaned.csv
In [ ]: