Space: ziadmostafa/Road-Accidents-Severity-Analysis

ziadmostafa committed on Apr 18

Commit

3771b6c

1 Parent(s): 1f15f83

first commit


Files changed (10)

.gitignore +47 -0
README.md +90 -2
RTA Dataset.csv +0 -0
Road_Accidents_Severity_Analysis.ipynb +0 -0
app.py +505 -0
best_accident_severity_model.pkl +3 -0
huggingface-metadata.yml +3 -0
label_encoders.pkl +3 -0
requirements.txt +9 -0
scaler.pkl +3 -0

.gitignore ADDED

	@@ -0,0 +1,47 @@

+# Python-related
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+env/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+*.egg-info/
+.installed.cfg
+*.egg
+# Jupyter Notebook
+.ipynb_checkpoints
+# Virtual Environment
+venv/
+ENV/
+env/
+# IDE-related
+.idea/
+.vscode/
+*.swp
+*.swo
+# OS-related
+.DS_Store
+Thumbs.db
+# Large data files
+# Uncomment if you don't want to include the dataset
+# *.csv
+# Logs
+logs/
+*.log

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 title: Road Accidents Severity Analysis
-emoji: 🏢
+emoji: 🚗
 colorFrom: pink
 colorTo: purple
 sdk: streamlit
@@ -10,4 +10,92 @@ pinned: false
 short_description: RTA
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Road Accidents Severity Analysis
+![Road Safety](https://img.shields.io/badge/Road-Safety-red)
+![Data Science](https://img.shields.io/badge/Data-Science-blue)
+![Machine Learning](https://img.shields.io/badge/Machine-Learning-green)
+## 📝 Project Description
+This project analyzes road traffic accident (RTA) data to identify patterns and factors that contribute to accident severity. Using machine learning models, we predict the severity of accidents based on various factors such as driver characteristics, vehicle conditions, road features, and environmental conditions.
+The insights from this analysis can help:
+- Identify high-risk scenarios for road accidents
+- Recommend preventive measures to reduce accident severity
+- Support traffic management and road safety policies
+- Raise awareness about factors contributing to severe accidents
+## 🔍 Dataset
+The dataset contains over 12,000 records of road traffic accidents with 32+ features including:
+- Driver information (age, gender, experience, education)
+- Vehicle details (type, service years, defects)
+- Road conditions and features
+- Environmental factors (weather, light conditions)
+- Accident details (collision type, vehicles involved, casualties)
+- Accident severity (target variable)
+## 🚀 Features
+- **Comprehensive Data Analysis**: Explore patterns and relationships in road accident data
+- **Interactive Visualizations**: 8+ interactive charts to understand accident factors
+- **Predictive Modeling**: Machine learning models to predict accident severity
+- **User-friendly Interface**: Input accident details to get severity predictions
+- **Feature Importance Analysis**: Understand which factors most influence accident severity
+## 🛠️ Installation & Setup
+1. Clone the repository:
+2. Install dependencies:
+   ```bash
+   pip install -r requirements.txt
+   ```
+3. Run the Jupyter notebook to train models:
+   ```bash
+   jupyter notebook Road_Accidents_Severity_Analysis.ipynb
+   ```
+4. Launch the Streamlit app:
+   ```bash
+   streamlit run app.py
+   ```
+## 🔧 Technologies Used
+- **Data Processing**: Pandas, NumPy
+- **Visualization**: Plotly, Cufflinks
+- **Machine Learning**: Scikit-learn
+- **Web Application**: Streamlit
+- **Other Tools**: Jupyter Notebook, Python
+## 📁 Project Structure
+```
+road-accidents-severity/
+├── Road_Accidents_Severity_Analysis.ipynb  # Analysis & model training
+├── app.py                                  # Streamlit application
+├── RTA Dataset.csv                         # Dataset
+├── requirements.txt                        # Dependencies
+├── README.md                               # Project documentation
+├── best_accident_severity_model.pkl        # Trained model
+├── label_encoders.pkl                      # Saved encoders
+└── scaler.pkl                              # Saved scaler
+```
+## 🔮 Future Improvements
+- Incorporate geographic data for spatial analysis
+- Implement more advanced models (e.g., XGBoost, neural networks)
+- Add time series analysis to identify temporal patterns
+- Develop a mobile app for on-the-go predictions
+- Include more interactive features in the dashboard
+## 👥 Contributors
+- [Ziad Mostafa](https://github.com/ziadmostafa1)
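
The project structure in the README lists four files that `app.py` loads at startup. A minimal sketch of a pre-launch check, assuming the files sit next to `app.py` as the tree shows; `missing_artifacts` and `REQUIRED_FILES` are hypothetical names, not part of the commit:

```python
from pathlib import Path

# File names taken from the README's project structure.
REQUIRED_FILES = [
    "RTA Dataset.csv",
    "best_accident_severity_model.pkl",
    "label_encoders.pkl",
    "scaler.pkl",
]


def missing_artifacts(root="."):
    """Return the required files that are absent from the given directory."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_artifacts()
if missing:
    print("Missing files:", ", ".join(missing))
```

If the `.pkl` files are reported missing, the README's step 3 (running the notebook) is what produces them.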

RTA Dataset.csv ADDED

The diff for this file is too large to render. See raw diff

Road_Accidents_Severity_Analysis.ipynb ADDED

The diff for this file is too large to render. See raw diff

app.py ADDED

	@@ -0,0 +1,505 @@

+import streamlit as st
+import pandas as pd
+import numpy as np
+import plotly.express as px
+import plotly.graph_objects as go
+import joblib
+from sklearn.preprocessing import LabelEncoder, StandardScaler
+from sklearn.decomposition import PCA
+# Set page configuration
+st.set_page_config(
+    page_title="Road Accidents Severity Analysis",
+    page_icon="🚗",
+    layout="wide",
+    initial_sidebar_state="expanded"
+)
+# Load the data
+@st.cache_data
+def load_data():
+    df = pd.read_csv('RTA Dataset.csv')
+    return df
+# Load the model and preprocessing objects
+@st.cache_resource
+def load_model():
+    try:
+        model = joblib.load('best_accident_severity_model.pkl')
+        label_encoders = joblib.load('label_encoders.pkl')
+        scaler = joblib.load('scaler.pkl')
+        return model, label_encoders, scaler
+    except Exception as e:
+        st.warning(f"Model files not found or error loading: {e}")
+        return None, None, None
+# Main function
+def main():
+    # Add a header
+    st.title("🚗 Road Accidents Severity Analysis Dashboard")
+    # Create tabs
+    tab1, tab2, tab3, tab4 = st.tabs(["📊 Data Overview", "📈 Visualizations", "🔍 Feature Analysis", "🤖 Prediction"])
+    # Load data
+    df = load_data()
+    # Load model and preprocessing objects
+    model, label_encoders, scaler = load_model()
+    # Tab 1: Data Overview
+    with tab1:
+        st.header("Dataset Overview")
+        st.write(f"Dataset Shape: {df.shape}")
+        # Display sample data
+        st.subheader("Sample Data")
+        st.dataframe(df.head())
+        # Display summary statistics
+        st.subheader("Summary Statistics")
+        st.dataframe(df.describe())
+        # Display missing values information
+        st.subheader("Missing Values")
+        missing_values = df.isnull().sum()
+        missing_percentage = (missing_values / len(df)) * 100
+        missing_df = pd.DataFrame({
+            'Missing Values': missing_values,
+            'Percentage': missing_percentage
+        })
+        missing_df = missing_df[missing_df['Missing Values'] > 0].sort_values('Percentage', ascending=False)
+        if not missing_df.empty:
+            st.dataframe(missing_df)
+        else:
+            st.write("No missing values in the dataset.")
+    # Tab 2: Visualizations
+    with tab2:
+        st.header("Data Visualizations")
+        # Create two columns
+        col1, col2 = st.columns(2)
+        with col1:
+            # Pie chart of accident severity
+            st.subheader("Accident Severity Distribution")
+            fig1 = px.pie(df, names='Accident_severity', title='Distribution of Accident Severity',
+                          color_discrete_sequence=px.colors.sequential.RdBu)
+            fig1.update_traces(textposition='inside', textinfo='percent+label')
+            st.plotly_chart(fig1, use_container_width=True)
+            # Bar chart of accident causes
+            st.subheader("Top Causes of Accidents")
+            cause_counts = df['Cause_of_accident'].value_counts().reset_index()
+            cause_counts.columns = ['Cause', 'Count']
+            cause_counts = cause_counts.sort_values('Count', ascending=False).head(10)
+            fig2 = px.bar(cause_counts, x='Count', y='Cause',
+                          title='Top 10 Causes of Accidents',
+                          orientation='h',
+                          color='Count',
+                          color_continuous_scale=px.colors.sequential.Viridis)
+            st.plotly_chart(fig2, use_container_width=True)
+        with col2:
+            # Histogram of casualties
+            st.subheader("Distribution of Casualties")
+            fig3 = px.histogram(df, x='Number_of_casualties',
+                               title='Distribution of Number of Casualties',
+                               nbins=30, color_discrete_sequence=['indianred'])
+            fig3.update_layout(bargap=0.2)
+            st.plotly_chart(fig3, use_container_width=True)
+            # Box plot of vehicles involved by severity
+            st.subheader("Vehicles Involved by Accident Severity")
+            fig4 = px.box(df, x='Accident_severity', y='Number_of_vehicles_involved',
+                         title='Number of Vehicles Involved by Accident Severity',
+                         color='Accident_severity', notched=True)
+            st.plotly_chart(fig4, use_container_width=True)
+        # Full width plots
+        st.subheader("Vehicle Types in Accidents")
+        vehicle_counts = df['Type_of_vehicle'].value_counts().reset_index()
+        vehicle_counts.columns = ['Vehicle Type', 'Count']
+        vehicle_counts = vehicle_counts.head(8)  # Top 8 vehicle types
+        fig5 = px.pie(vehicle_counts, values='Count', names='Vehicle Type',
+                     title='Distribution of Vehicle Types in Accidents',
+                     hole=0.4, color_discrete_sequence=px.colors.sequential.Plasma_r)
+        fig5.update_traces(textposition='inside', textinfo='percent+label')
+        st.plotly_chart(fig5, use_container_width=True)
+        # Relationship between vehicles and casualties
+        st.subheader("Relationship Between Vehicles and Casualties")
+        fig6 = px.scatter(df, x='Number_of_vehicles_involved', y='Number_of_casualties',
+                         color='Accident_severity', size='Number_of_casualties',
+                         title='Relationship Between Vehicles Involved and Casualties',
+                         opacity=0.7)
+        fig6.update_traces(marker=dict(line=dict(width=0.5, color='DarkSlateGrey')))
+        st.plotly_chart(fig6, use_container_width=True)
+    # Tab 3: Feature Analysis
+    with tab3:
+        st.header("Feature Analysis")
+        # Feature exploration
+        feature_col1, feature_col2 = st.columns([1, 2])
+        with feature_col1:
+            st.subheader("Feature Selection")
+            feature_options = df.columns.tolist()
+            selected_feature = st.selectbox("Select a feature to analyze:", feature_options)
+            if selected_feature:
+                if df[selected_feature].dtype in ['int64', 'float64']:
+                    st.write(f"Statistical Summary for {selected_feature}:")
+                    st.dataframe(df[selected_feature].describe())
+                else:
+                    value_counts = df[selected_feature].value_counts().reset_index()
+                    value_counts.columns = ['Value', 'Count']
+                    st.write(f"Value Counts for {selected_feature}:")
+                    st.dataframe(value_counts)
+        with feature_col2:
+            if selected_feature:
+                st.subheader(f"Visualization for {selected_feature}")
+                if df[selected_feature].dtype in ['int64', 'float64']:
+                    # Numerical feature
+                    fig = px.histogram(df, x=selected_feature, color='Accident_severity',
+                                       title=f'Distribution of {selected_feature} by Accident Severity',
+                                       marginal='box')
+                else:
+                    # Categorical feature
+                    cat_counts = df.groupby([selected_feature, 'Accident_severity']).size().reset_index(name='Count')
+                    fig = px.bar(cat_counts, x=selected_feature, y='Count', color='Accident_severity',
+                                 title=f'{selected_feature} vs Accident Severity',
+                                 barmode='group')
+                st.plotly_chart(fig, use_container_width=True)
+        # Feature correlation analysis
+        st.subheader("Feature Correlation Analysis")
+        # Get only numerical columns for correlation
+        numeric_df = df.select_dtypes(include=['int64', 'float64'])
+        # Calculate and plot correlation matrix
+        corr_matrix = numeric_df.corr()
+        fig_corr = px.imshow(corr_matrix, text_auto=True, color_continuous_scale='RdBu_r',
+                            title='Correlation Matrix of Numerical Features')
+        st.plotly_chart(fig_corr, use_container_width=True)
+    # Tab 4: Prediction
+    with tab4:
+        st.header("Accident Severity Prediction")
+        if model is None:
+            st.error("Model not loaded. Please run the notebook first to train and save the model.")
+        else:
+            st.write("Enter the details below to predict accident severity:")
+            # Get all required features from the model
+            expected_features = []
+            if hasattr(model, 'feature_names_in_'):
+                expected_features = list(model.feature_names_in_)
+                st.write(f"The model expects {len(expected_features)} features.")
+            # Create layout for input features
+            col1, col2, col3 = st.columns(3)
+            # Create a dictionary to store all input values
+            input_data = {}
+            with col1:
+                # Time-related inputs
+                time_options = ["Morning (6AM-12PM)", "Afternoon (12PM-6PM)", "Evening (6PM-12AM)", "Night (12AM-6AM)"]
+                selected_time = st.selectbox("Time of Day:", time_options)
+                # Map time selection to hour (middle of the range)
+                time_mapping = {
+                    "Morning (6AM-12PM)": 9,
+                    "Afternoon (12PM-6PM)": 15,
+                    "Evening (6PM-12AM)": 21,
+                    "Night (12AM-6AM)": 3
+                }
+                input_data["Hour"] = time_mapping[selected_time]
+                # Day of week
+                day_options = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
+                input_data["Day_of_week"] = st.selectbox("Day of Week:", day_options)
+                # Driver information
+                age_band_options = ['Under 18', '18-30', '31-50', 'Over 51']
+                input_data["Age_band_of_driver"] = st.selectbox("Driver Age Band:", age_band_options)
+                input_data["Sex_of_driver"] = st.selectbox("Driver Sex:", ['Male', 'Female'])
+                edu_options = ['Above high school', 'Junior high school', 'Elementary school',
+                              'High school', 'Writing & reading', 'Illiterate', 'Unknown']
+                input_data["Educational_level"] = st.selectbox("Educational Level:", edu_options)
+                relation_options = ['Employee', 'Owner', 'Other', 'Unknown']
+                input_data["Vehicle_driver_relation"] = st.selectbox("Vehicle-Driver Relation:", relation_options)
+                exp_options = ['No Licence', 'Below 1yr', '1-2yr', '2-5yr', '5-10yr', 'Above 10yr', 'Unknown']
+                input_data["Driving_experience"] = st.selectbox("Driving Experience:", exp_options)
+                # Pre-calculate Experience_Value for the model
+                experience_mapping = {
+                    'No Licence': 0, 'Below 1yr': 0.5, '1-2yr': 1.5,
+                    '2-5yr': 3.5, '5-10yr': 7.5, 'Above 10yr': 15,
+                    'Unknown': 5  # Default value for unknown
+                }
+                input_data["Experience_Value"] = experience_mapping[input_data["Driving_experience"]]
+                # Pre-calculate Age_Value for the model
+                age_mapping = {
+                    'Under 18': 16, '18-30': 24, '31-50': 40, 'Over 51': 60
+                }
+                input_data["Age_Value"] = age_mapping[input_data["Age_band_of_driver"]]
+            with col2:
+                # Vehicle information
+                vehicle_options = ['Automobile', 'Lorry (41?100Q)', 'Lorry (11?40Q)', 'Public (12 seats)',
+                                 'Public (13-45 seats)', 'Public (> 45 seats)', 'Motorcycle', 'Other']
+                input_data["Type_of_vehicle"] = st.selectbox("Vehicle Type:", vehicle_options)
+                owner_options = ['Owner', 'Governmental', 'Organization', 'Other']
+                input_data["Owner_of_vehicle"] = st.selectbox("Owner of Vehicle:", owner_options)
+                service_options = ['Unknown', '1-2yr', '2-5yr', '5-10yr', 'Above 10yr', 'Below 1yr']
+                input_data["Service_year_of_vehicle"] = st.selectbox("Service Year of Vehicle:", service_options)
+                defect_options = ['No defect', 'Defective tire', 'Defective break']
+                input_data["Defect_of_vehicle"] = st.selectbox("Vehicle Defect:", defect_options)
+                # Road and environment information
+                area_options = ['Other', 'Office areas', 'Residential areas', 'Rural village areas',
+                               'Church areas', 'School areas', 'Market areas', 'Hospital areas',
+                               'Industrial areas', 'Rural village areasOffice areas', 'Recreational areas',
+                               'Outside rural areas', 'Unknown', 'Rural village']
+                input_data["Area_accident_occured"] = st.selectbox("Area Accident Occurred:", area_options)
+                lanes_options = ['Two-way (divided with broken lines road marking)', 'Undivided Two way',
+                                'One way', 'Double carriageway (median)', 'Two-way (divided with solid lines road marking)',
+                                'Unknown', 'Other']
+                input_data["Lanes_or_Medians"] = st.selectbox("Lanes or Medians:", lanes_options)
+            with col3:
+                # More road information
+                road_align_options = ['Tangent road with flat terrain', 'Tangent road with mild grade and flat terrain',
+                                     'Steep grade downward with mountainous terrain', 'Escarpments',
+                                     'Tangent road with mountainous terrain and', 'Steep grade upward with mountainous terrain',
+                                     'Gentle horizontal curve', 'Sharp reverse curve', 'Tangent road with rolling terrain']
+                input_data["Road_allignment"] = st.selectbox("Road Alignment:", road_align_options)
+                junction_options = ['Y Shape', 'No junction', 'Other', 'Crossing', 'O Shape', 'Unknown', 'T Shape', 'X Shape']
+                input_data["Types_of_Junction"] = st.selectbox("Type of Junction:", junction_options)
+                surface_type_options = ['Asphalt roads', 'Earth roads', 'Gravel roads', 'Other', 'Asphalt roads with some distress']
+                input_data["Road_surface_type"] = st.selectbox("Road Surface Type:", surface_type_options)
+                surface_condition_options = ['Dry', 'Wet or damp', 'Snow', 'Flood over 3cm. deep']
+                input_data["Road_surface_conditions"] = st.selectbox("Road Surface Conditions:", surface_condition_options)
+                light_options = ['Daylight', 'Darkness - lights lit', 'Darkness - no lighting', 'Darkness - lights unlit']
+                input_data["Light_conditions"] = st.selectbox("Light Conditions:", light_options)
+                weather_options = ['Normal', 'Raining', 'Cloudy', 'Other', 'Raining and Windy',
+                                  'Fog or mist', 'Windy', 'Snow', 'Unknown']
+                input_data["Weather_conditions"] = st.selectbox("Weather Conditions:", weather_options)
+            # Additional column for more inputs
+            col4, col5, col6 = st.columns(3)
+            with col4:
+                # Collision and vehicle information
+                collision_options = ['Vehicle with vehicle collision', 'Collision with roadside objects',
+                                    'Collision with pedestrians', 'Rollover', 'Collision with animals',
+                                    'Collision with roadside-parked vehicles', 'Fall from vehicles',
+                                    'Other', 'Unknown', 'With Train']
+                input_data["Type_of_collision"] = st.selectbox("Type of Collision:", collision_options)
+                input_data["Number_of_vehicles_involved"] = st.number_input("Number of Vehicles Involved:",
+                                                                       min_value=1, max_value=10, value=2)
+                input_data["Number_of_casualties"] = st.number_input("Number of Casualties:",
+                                                                min_value=1, max_value=10, value=1)
+                movement_options = ['Going straight', 'Moving Backward', 'U-Turn', 'Other', 'Reversing',
+                                   'Parked', 'Waiting to go', 'Getting off', 'Overtaking', 'Unknown',
+                                   'Stopping', 'Changing lane to the right', 'Changing lane to the left']
+                input_data["Vehicle_movement"] = st.selectbox("Vehicle Movement:", movement_options)
+            with col5:
+                # Casualty information
+                casualty_class_options = ['Driver or rider', 'na', 'Pedestrian', 'Passenger']
+                input_data["Casualty_class"] = st.selectbox("Casualty Class:", casualty_class_options)
+                sex_casualty_options = ['Male', 'na', 'Female']
+                input_data["Sex_of_casualty"] = st.selectbox("Sex of Casualty:", sex_casualty_options)
+                age_casualty_options = ['na', '18-30', '31-50', 'Over 51', 'Under 18', '5']
+                input_data["Age_band_of_casualty"] = st.selectbox("Age Band of Casualty:", age_casualty_options)
+                casualty_severity_options = ['3', 'na', '2', '1']
+                input_data["Casualty_severity"] = st.selectbox("Casualty Severity:", casualty_severity_options)
+            with col6:
+                # Final inputs
+                fitness_options = ['Normal', 'With infirmity', 'Alcohol', 'Illness', 'Asleep or Fatigued']
+                input_data["Fitness_of_casuality"] = st.selectbox("Fitness of Casualty:", fitness_options)
+                pedestrian_options = ['Not a Pedestrian',
+                                     'Crossing from nearside - masked by parked or statioNot a Pedestrianry vehicle',
+                                     'Unknown or other', 'In carriageway, stationary - not crossing',
+                                     'Walking along in carriageway, back to traffic',
+                                     'Crossing from nearside', 'Crossing from offside',
+                                     'Walking along in carriageway, facing traffic',
+                                     'Playing in carriageway']
+                input_data["Pedestrian_movement"] = st.selectbox("Pedestrian Movement:", pedestrian_options)
+                cause_options = ['No distancing', 'Changing lane to the right', 'Driving carelessly',
+                               'No priority to vehicle', 'Moving Backward', 'No priority to pedestrian',
+                               'Other', 'Overtaking', 'Driving under the influence of drugs',
+                               'Driving to the left', 'Getting off the vehicle improperly',
+                               'Driving at high speed', 'Overturning', 'Turnover', 'Overspeed',
+                               'Overloading', 'Drunk driving', 'Unknown', 'Improper parking',
+                               'Driving on the wrong side of the road']
+                input_data["Cause_of_accident"] = st.selectbox("Cause of Accident:", cause_options)
+                work_casualty_options = ['Driver', 'Other', 'Unemployed', 'Employee', 'Self-employed', 'Student', 'Unknown']
+                input_data["Work_of_casuality"] = st.selectbox("Work of Casualty:", work_casualty_options)
+            # Calculate derived features needed by the model
+            # Casualty to vehicle ratio
+            input_data["Casualty_to_vehicle_ratio"] = input_data["Number_of_casualties"] / input_data["Number_of_vehicles_involved"]
+            # Driver Risk Score
+            normalized_age_risk = 1 - (input_data["Age_Value"] / 60)  # Assuming 60 is max age
+            normalized_exp_risk = 1 - (input_data["Experience_Value"] / 15)  # Assuming 15 is max experience
+            input_data["Driver_Risk_Score"] = (normalized_age_risk + normalized_exp_risk) / 2
+            # Environmental risk factors
+            weather_risk = {
+                'Normal': 0.2, 'Raining': 0.7, 'Cloudy': 0.4, 'Windy': 0.5,
+                'Snow': 0.8, 'Fog or mist': 0.9, 'Raining and Windy': 0.8,
+                'Other': 0.5, 'Unknown': 0.5
+            }
+            light_risk = {
+                'Daylight': 0.2, 'Darkness - lights lit': 0.5,
+                'Darkness - no lighting': 0.9, 'Darkness - lights unlit': 0.8
+            }
+            input_data["Weather_Risk"] = weather_risk.get(input_data["Weather_conditions"], 0.5)
+            input_data["Light_Risk"] = light_risk.get(input_data["Light_conditions"], 0.5)
+            input_data["Environmental_Risk"] = (input_data["Weather_Risk"] + input_data["Light_Risk"]) / 2
+            # Add Is_Weekend (assuming Python's datetime conventions where Monday is 0 and Sunday is 6)
+            day_to_num = {
+                'Monday': 0, 'Tuesday': 1, 'Wednesday': 2, 'Thursday': 3,
+                'Friday': 4, 'Saturday': 5, 'Sunday': 6
+            }
+            day_num = day_to_num.get(input_data["Day_of_week"], 0)
+            input_data["Is_Weekend"] = 1 if day_num >= 5 else 0
+            # Add Is_Night
+            input_data["Is_Night"] = 1 if (input_data["Hour"] >= 18 or input_data["Hour"] < 6) else 0
+            # Check for missing expected features and add defaults
+            for feature in expected_features:
+                if feature not in input_data:
+                    if feature.startswith("Number_"):
+                        input_data[feature] = 0  # Default value for numerical features
+                    else:
+                        input_data[feature] = "Unknown"  # Default value for categorical features
+            if st.button("Predict Accident Severity"):
+                try:
+                    # Create a DataFrame from the input data with matching features
+                    input_df = pd.DataFrame([input_data])
+                    # Keep only the expected features, in the order the model expects;
+                    # any feature still missing is filled with 0
+                    if expected_features:
+                        input_df = input_df.reindex(columns=expected_features, fill_value=0)
+                    # Encode categorical features with silent error handling
+                    for col in input_df.columns:
+                        if col in label_encoders and isinstance(input_df[col].iloc[0], str):
+                            try:
+                                le = label_encoders[col]
+                                # Check if the value exists in the label encoder
+                                if input_df[col].iloc[0] in le.classes_:
+                                    input_df[col] = le.transform(input_df[col])
+                                else:
+                                    # Unknown value: fall back to the encoder's first class.
+                                    # classes_ is sorted, so this is not necessarily the most common class.
+                                    fallback_class = le.classes_[0]
+                                    input_df[col] = le.transform([fallback_class])
+                            except Exception as e:
+                                # Use a fallback value silently
+                                input_df[col] = 0
+                    # Make prediction
+                    prediction = model.predict(input_df)[0]
+                    # Map prediction back to class label
+                    severity_mapping = {0: 'Slight Injury', 1: 'Serious Injury', 2: 'Fatal Injury'}
+                    predicted_severity = severity_mapping.get(prediction, str(prediction))
+                    # Show prediction with styling based on severity
+                    severity_color = {
+                        'Slight Injury': 'green',
+                        'Serious Injury': 'orange',
+                        'Fatal Injury': 'red'
+                    }
+                    color = severity_color.get(predicted_severity, 'blue')
+                    st.markdown(f"""
+                    <div style="background-color: {color}; padding: 20px; border-radius: 10px; text-align: center;">
+                        <h2 style="color: white;">Predicted Accident Severity</h2>
+                        <h1 style="color: white;">{predicted_severity}</h1>
+                    </div>
+                    """, unsafe_allow_html=True)
+                    # Show prediction probability if available
+                    if hasattr(model, 'predict_proba'):
+                        try:
+                            probabilities = model.predict_proba(input_df)[0]
+                            st.subheader("Prediction Confidence")
+                            proba_df = pd.DataFrame({
+                                'Severity': list(severity_mapping.values())[:len(probabilities)],
+                                'Probability': probabilities
+                            })
+                            fig = px.bar(proba_df, x='Severity', y='Probability',
+                                        color='Severity', color_discrete_map=severity_color)
+                            st.plotly_chart(fig, use_container_width=True)
+                        except Exception as e:
+                            st.error(f"Error displaying probabilities: {e}")
+                except Exception as e:
+                    st.error(f"Prediction error: {e}")
+# Run the app
+if __name__ == "__main__":
+    main()
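
The prediction tab computes several derived features inline before calling the model. The same arithmetic can be sketched as a pure function, which makes it easier to check by hand. The constants are copied from `app.py`; the function name `derive_features` and its signature are assumptions made for illustration:

```python
# Risk tables copied from app.py.
WEATHER_RISK = {
    'Normal': 0.2, 'Raining': 0.7, 'Cloudy': 0.4, 'Windy': 0.5,
    'Snow': 0.8, 'Fog or mist': 0.9, 'Raining and Windy': 0.8,
    'Other': 0.5, 'Unknown': 0.5,
}
LIGHT_RISK = {
    'Daylight': 0.2, 'Darkness - lights lit': 0.5,
    'Darkness - no lighting': 0.9, 'Darkness - lights unlit': 0.8,
}
WEEKEND_DAYS = {'Saturday', 'Sunday'}


def derive_features(casualties, vehicles, age_value, experience_value,
                    weather, light, day, hour):
    """Return the engineered features app.py adds to the raw form inputs."""
    weather_risk = WEATHER_RISK.get(weather, 0.5)
    light_risk = LIGHT_RISK.get(light, 0.5)
    age_risk = 1 - age_value / 60          # app.py treats 60 as the maximum age
    exp_risk = 1 - experience_value / 15   # app.py treats 15 as the maximum experience
    return {
        'Casualty_to_vehicle_ratio': casualties / vehicles,
        'Driver_Risk_Score': (age_risk + exp_risk) / 2,
        'Weather_Risk': weather_risk,
        'Light_Risk': light_risk,
        'Environmental_Risk': (weather_risk + light_risk) / 2,
        'Is_Weekend': int(day in WEEKEND_DAYS),
        'Is_Night': int(hour >= 18 or hour < 6),
    }


# An 18-30 driver with 1-2 years of experience, Saturday evening, rain in daylight.
features = derive_features(casualties=2, vehicles=2, age_value=24,
                           experience_value=1.5, weather='Raining',
                           light='Daylight', day='Saturday', hour=21)
print(features)
```

With these formulas the 'Over 51' band (age value 60) contributes an age risk of 0, and 'Above 10yr' (experience value 15) contributes an experience risk of 0.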

best_accident_severity_model.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ea1e3aeb07c95a9d32d9b339509516d01b15d3509af26ca1fd4b1205d7998f36
+size 3025761
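
`app.py` reads the model's `feature_names_in_` and gives any feature the form does not collect a default (0 for names starting with `Number_`, `"Unknown"` otherwise). A minimal plain-Python sketch of that alignment step; `align_features` is a hypothetical helper and the three-feature list is a made-up subset, not the model's real feature list:

```python
def align_features(input_data, expected_features):
    """Return values ordered like expected_features, filling gaps with defaults."""
    row = []
    for name in expected_features:
        if name in input_data:
            row.append(input_data[name])
        elif name.startswith("Number_"):
            row.append(0)           # numeric default, as in app.py
        else:
            row.append("Unknown")   # categorical default, as in app.py
    return row


# Made-up subset of features, for illustration only.
expected = ["Hour", "Number_of_casualties", "Weather_conditions"]
row = align_features({"Weather_conditions": "Raining", "Hour": 9, "Extra": 1}, expected)
print(row)  # [9, 0, 'Raining']
```

Keys the model does not expect (here `Extra`) are dropped, and the output order follows the expected list rather than the input dictionary.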

huggingface-metadata.yml ADDED

	@@ -0,0 +1,3 @@

+sdk_version: 3.0.0
+app_file: app.py
+pinned: false

label_encoders.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7ea390bb333b7f2f616f31fcf67401b5350abd11e086f93b8e5bbde6ed493e0e
+size 30581
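
scikit-learn's `LabelEncoder` assigns integer codes by the sorted order of the unique labels. The diff does not show how the notebook encoded `Accident_severity`, so the following is only a sketch of what the codes would be if the target had gone through a `LabelEncoder`. `label_codes` is a hypothetical stand-in that mimics the ordering rule in plain Python:

```python
def label_codes(labels):
    """Mimic LabelEncoder: codes follow the sorted order of the unique labels."""
    classes = sorted(set(labels))
    return {label: code for code, label in enumerate(classes)}


# Severity labels as spelled in app.py's severity_mapping.
codes = label_codes(["Slight Injury", "Serious Injury", "Fatal Injury"])
print(codes)  # {'Fatal Injury': 0, 'Serious Injury': 1, 'Slight Injury': 2}
```

Under that assumption, code 0 would mean 'Fatal Injury', whereas the hard-coded `severity_mapping` in `app.py` maps 0 to 'Slight Injury'. If the notebook used a manual mapping instead, the hard-coded dictionary may well be correct; checking it against the notebook is worthwhile either way. The same ordering rule is why `le.classes_[0]` is the alphabetically first class rather than the most frequent one.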

requirements.txt ADDED

	@@ -0,0 +1,9 @@

+streamlit==1.27.0
+pandas==1.5.3
+numpy==1.24.3
+plotly==5.15.0
+joblib==1.2.0
+scikit-learn==1.2.2
+cufflinks==0.17.3
+matplotlib==3.7.2
+scipy==1.10.1
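
Pickled scikit-learn objects are sensitive to the library version they were saved with, so it can help to confirm that the running environment matches these pins before loading the `.pkl` files. A minimal sketch using only the standard library; `parse_pins` and `find_mismatches` are hypothetical helpers, and the parser handles only plain `name==version` lines like the ones above:

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed or None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


pins = parse_pins("scikit-learn==1.2.2\njoblib==1.2.0\n")
print(find_mismatches(pins))
```

An empty dictionary means every pinned package is installed at exactly the pinned version.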

scaler.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:560f303d719d65a8a8505ddb2d83814fa57acbc4f082b8ba22e0e0548a9206f8
+size 2727
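
The three `.pkl` entries in this commit are Git LFS pointer files: each three-line diff records the spec version, the SHA-256 object ID, and the byte size of the real file, not the binary itself. A minimal sketch that parses such a pointer, using the `scaler.pkl` text above; `parse_lfs_pointer` is a hypothetical helper:

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:560f303d719d65a8a8505ddb2d83814fa57acbc4f082b8ba22e0e0548a9206f8
size 2727
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])  # sha256 2727
```

If `joblib.load` fails on a freshly cloned copy, one thing to check is whether the file on disk is still this small text pointer rather than the downloaded binary.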