Wine Quality Prediction Using a Random Forest Classifier
This project predicts the quality of wines from their physicochemical properties using a Random Forest Classifier. The resulting model can help winemakers and distributors monitor and maintain quality standards.
Methodology:
Data Preprocessing:
Used a dataset of chemical attributes of wine, such as acidity, sugar levels, pH, and alcohol content.
Handled missing values, standardized numerical features, and encoded categorical variables before model training.
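The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the tiny data frame, the column names (including the hypothetical wine_type column), and the choice of median imputation and one-hot encoding are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up rows standing in for the real dataset; column names are assumptions.
df = pd.DataFrame({
    "fixed_acidity": [7.4, 7.8, np.nan, 11.2],
    "residual_sugar": [1.9, 2.6, 2.3, 1.9],
    "pH": [3.51, 3.20, 3.26, np.nan],
    "alcohol": [9.4, 9.8, 9.8, 10.0],
    "wine_type": ["red", "white", "red", "white"],  # hypothetical categorical column
})

numeric_cols = ["fixed_acidity", "residual_sugar", "pH", "alcohol"]
categorical_cols = ["wine_type"]

# Numeric columns: fill missing values with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode. sparse_threshold=0 forces a dense array.
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,
)

X = preprocess.fit_transform(df)
print(X.shape)
```

In a real workflow the transformer would be fitted on the training split only and then applied to the test split, so that no test-set statistics leak into the scaling.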
Feature Engineering:
Conducted exploratory data analysis to identify key features influencing wine quality.
Applied feature importance metrics from Random Forest to rank predictors.
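A ranking of this kind can be read from the fitted model's feature_importances_ attribute. The sketch below uses synthetic data whose label is deliberately constructed to depend on alcohol alone, so its output illustrates the mechanics only and is not evidence about real wines; the column names and distributions are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 300

# Synthetic features; names and distributions are assumptions for illustration.
X = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.0, n),
    "volatile_acidity": rng.normal(0.5, 0.15, n),
    "pH": rng.normal(3.3, 0.15, n),
})

# Synthetic label that depends on alcohol only, so the ranking is predictable.
y = (X["alcohol"] > 10.5).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, one per feature, sorted from most to least important.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

Impurity-based importances can be biased toward features with many distinct values; sklearn.inspection.permutation_importance is a common cross-check.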
Model Training:
Trained a Random Forest Classifier to predict wine quality as a categorical variable ranging from low to high.
Balanced the dataset using oversampling techniques to address class imbalance.
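One way to combine oversampling with training is shown below. The write-up does not say which oversampling technique was used, so this sketch assumes simple random oversampling with sklearn.utils.resample; the synthetic data, the feature names f1 to f4, and the three quality labels are likewise assumptions. Oversampling is applied to the training split only, so duplicated rows cannot appear in the test set.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

rng = np.random.default_rng(42)
n = 400

# Synthetic, imbalanced data; feature names and class labels are assumptions.
df = pd.DataFrame(rng.normal(size=(n, 4)), columns=["f1", "f2", "f3", "f4"])
df["quality"] = rng.choice(["low", "medium", "high"], size=n, p=[0.15, 0.70, 0.15])

# Split first so the test set keeps the original class distribution.
train, test = train_test_split(
    df, test_size=0.25, stratify=df["quality"], random_state=42
)

# Randomly oversample every class (with replacement) up to the largest class size.
target_size = train["quality"].value_counts().max()
balanced = pd.concat([
    resample(group, replace=True, n_samples=target_size, random_state=42)
    for _, group in train.groupby("quality")
])

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(balanced.drop(columns="quality"), balanced["quality"])
print(balanced["quality"].value_counts())
```

Passing class_weight="balanced" to RandomForestClassifier is an alternative that reweights classes without duplicating rows.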
Hyperparameter Optimization:
Tuned the number of estimators, maximum tree depth, and minimum number of samples required to split a node using grid search.
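A grid search over these three hyperparameters might look like the sketch below. The candidate values, the 3-fold cross-validation, the macro-F1 scoring, and the synthetic data from make_classification are all assumptions chosen to keep the example small; the values actually searched in the project are not stated.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic three-class data standing in for the wine features (assumption).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, n_classes=3, random_state=0
)

# Illustrative grid: 2 x 2 x 2 = 8 candidate settings.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [None, 5],
    "min_samples_split": [2, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,                 # 3-fold cross-validation per candidate
    scoring="f1_macro",   # macro-averaged F1 treats all classes equally
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

By default GridSearchCV refits the best setting on the full data passed to fit, so search.best_estimator_ can be used directly for prediction.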
Key Results and Insights:
Achieved an accuracy of XX% and an F1-score of YY%, indicating robust classification performance.
Identified alcohol content and volatile acidity as the most influential factors in determining wine quality.
Demonstrated that Random Forest can give reliable predictions on multi-feature tabular data while remaining interpretable through its feature importances.
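Accuracy and F1 figures like those reported above are typically computed on a held-out test set. The sketch below shows one way to do that with scikit-learn; the synthetic data and the choice of weighted averaging for the multiclass F1-score are assumptions, since the write-up does not say which averaging was used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, f1_score
from sklearn.model_selection import train_test_split

# Synthetic three-class data standing in for the wine features (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, n_classes=3, random_state=1
)

# Hold out 25% of the rows, preserving class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

model = RandomForestClassifier(n_estimators=100, random_state=1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Weighted F1 averages per-class F1 by class support (an assumed choice).
weighted_f1 = f1_score(y_test, y_pred, average="weighted")

print(f"accuracy: {accuracy:.3f}")
print(f"weighted F1: {weighted_f1:.3f}")
print(classification_report(y_test, y_pred))
```

When classes are imbalanced, the per-class rows of classification_report are usually more informative than accuracy alone.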
Skills and Tools:
Machine Learning (Random Forest Classifier), Python (Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn), Data Preprocessing, and Model Evaluation.
This project shows how machine learning can support quality assurance in the food and beverage industry, helping producers maintain production standards and customer satisfaction.
GitHub link: https://github.com/sm98code/Wine-quality-prediction.git
Visualization Output