🫁 Datathon Lung Cancer Detector

This model predicts whether a patient is likely to have lung cancer based on clinical and behavioral risk factors.
It was trained on a dataset of 309 entries with 15 input features and a binary diagnosis label.


📊 Input Features

| Feature | Type | Description |
|---|---|---|
| GENDER | 0 = Female, 1 = Male | Biological sex |
| AGE | Integer | Patient age in years |
| SMOKING | 0/1 | Smoking habit |
| YELLOW_FINGERS | 0/1 | Stained fingers from smoking |
| ANXIETY | 0/1 | Anxiety symptoms |
| PEER_PRESSURE | 0/1 | Influence from peers |
| CHRONIC DISEASE | 0/1 | History of chronic illness |
| FATIGUE | 0/1 | Feeling of tiredness |
| ALLERGY | 0/1 | Known allergies |
| WHEEZING | 0/1 | Wheezing symptoms |
| ALCOHOL CONSUMING | 0/1 | Alcohol consumption |
| COUGHING | 0/1 | Persistent coughing |
| SHORTNESS OF BREATH | 0/1 | Difficulty breathing |
| SWALLOWING DIFFICULTY | 0/1 | Trouble swallowing |
| CHEST PAIN | 0/1 | Pain in chest area |
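
The feature list above can be turned into a small input check that runs before a row reaches the model. The following is a minimal sketch, not part of the published app: the helper name `validate_row` is hypothetical, and the 0 to 120 age range is an assumption borrowed from the `max_value=120` setting in the Streamlit form.

```python
# Feature names copied from the feature list above (note that some contain spaces).
BINARY_FEATURES = [
    "GENDER", "SMOKING", "YELLOW_FINGERS", "ANXIETY", "PEER_PRESSURE",
    "CHRONIC DISEASE", "FATIGUE", "ALLERGY", "WHEEZING", "ALCOHOL CONSUMING",
    "COUGHING", "SHORTNESS OF BREATH", "SWALLOWING DIFFICULTY", "CHEST PAIN",
]
# Column order used by the model input: GENDER, AGE, then the remaining flags.
FEATURE_ORDER = ["GENDER", "AGE"] + BINARY_FEATURES[1:]


def validate_row(row):
    """Return a list of problems found in `row` (a dict); empty means valid.

    Hypothetical helper for illustration only.
    """
    problems = []
    for name in FEATURE_ORDER:
        if name not in row:
            problems.append(f"missing feature: {name}")
    for name in BINARY_FEATURES:
        if name in row and row[name] not in (0, 1):
            problems.append(f"{name} must be 0 or 1, got {row[name]!r}")
    age = row.get("AGE")
    if age is not None:
        # Assumed plausible range; bool is excluded because it subclasses int.
        is_int = isinstance(age, int) and not isinstance(age, bool)
        if not is_int or not 0 <= age <= 120:
            problems.append(f"AGE must be an integer in 0..120, got {age!r}")
    return problems
```

Calling `validate_row` on a dict keyed by the feature names returns an empty list when all 15 features are present and well-formed, so the caller can show the returned messages to the user instead of sending a malformed row to the model.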

🧠 Model Info

  • Algorithm: XGBoost Classifier (highest score)
  • Framework: Scikit-learn
  • Target: DIAGNOSIS_LUNG_CANCER (YES = Lung Cancer, NO = No Cancer)
  • Dataset Size: 309 samples
  • Preprocessing: Label encoding, binary encoding for yes/no inputs
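
The preprocessing step listed above can be sketched as follows. This is an illustration under stated assumptions, not the training script itself: it assumes the raw records hold the strings "YES"/"NO" for the binary columns and the target, and "M"/"F" for GENDER, and the function name `encode_record` is hypothetical.

```python
# Assumed raw encodings; the numeric side follows the feature list and target above.
YES_NO = {"YES": 1, "NO": 0}   # also applies to DIAGNOSIS_LUNG_CANCER
GENDER = {"F": 0, "M": 1}      # 0 = Female, 1 = Male


def encode_record(raw):
    """Map one raw record (dict of strings) to numeric values.

    Hypothetical helper: AGE is cast to int, GENDER is label-encoded,
    and every other column is treated as a yes/no flag.
    """
    encoded = {}
    for name, value in raw.items():
        if name == "AGE":
            encoded[name] = int(value)
        elif name == "GENDER":
            encoded[name] = GENDER[str(value).strip().upper()]
        else:
            encoded[name] = YES_NO[str(value).strip().upper()]
    return encoded
```

An unexpected value such as "maybe" raises a `KeyError`, which surfaces dirty data early rather than silently encoding it as 0.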

🚀 Try It in Streamlit

This model is also available as a web app built with Streamlit, hosted at https://datathonlungcancer.streamlit.app/. The app's source code:

import streamlit as st
import pandas as pd
import joblib

model = joblib.load('model.pkl')

st.title('🫁 Lung Cancer Diagnosis')
st.write("Please fill out the following information to assess the likelihood of lung cancer.")

gender = st.selectbox('Gender', [0, 1], format_func=lambda x: "Female" if x == 0 else "Male")
age = st.number_input('Age', min_value=0, max_value=120, value=0)
smoking = st.selectbox('Smoking', ['Yes', 'No'])
yellow_fingers = st.selectbox('Yellow Fingers', ['Yes', 'No'])
anxiety = st.selectbox('Anxiety', ['Yes', 'No'])
peer_pressure = st.selectbox('Peer Pressure', ['Yes', 'No'])
chronic_disease = st.selectbox('Chronic Disease', ['Yes', 'No'])
fatigue = st.selectbox('Fatigue', ['Yes', 'No'])
allergy = st.selectbox('Allergy', ['Yes', 'No'])
wheezing = st.selectbox('Wheezing', ['Yes', 'No'])
alcohol = st.selectbox('Alcohol Consuming', ['Yes', 'No'])
coughing = st.selectbox('Coughing', ['Yes', 'No'])
shortness_of_breath = st.selectbox('Shortness of Breath', ['Yes', 'No'])
swallowing_difficulty = st.selectbox('Swallowing Difficulty', ['Yes', 'No'])
chest_pain = st.selectbox('Chest Pain', ['Yes', 'No'])

def binary_encode(value):
    return 1 if value == 'Yes' else 0

data = pd.DataFrame([[gender, age,
                      binary_encode(smoking),
                      binary_encode(yellow_fingers),
                      binary_encode(anxiety),
                      binary_encode(peer_pressure),
                      binary_encode(chronic_disease),
                      binary_encode(fatigue),
                      binary_encode(allergy),
                      binary_encode(wheezing),
                      binary_encode(alcohol),
                      binary_encode(coughing),
                      binary_encode(shortness_of_breath),
                      binary_encode(swallowing_difficulty),
                      binary_encode(chest_pain)]],
                    columns=['GENDER', 'AGE', 'SMOKING', 'YELLOW_FINGERS', 'ANXIETY',
                             'PEER_PRESSURE', 'CHRONIC DISEASE', 'FATIGUE', 'ALLERGY',
                             'WHEEZING', 'ALCOHOL CONSUMING', 'COUGHING',
                             'SHORTNESS OF BREATH', 'SWALLOWING DIFFICULTY', 'CHEST PAIN'])

if st.button('Predict'):
    prediction = model.predict(data)[0]
    if prediction == 1:
        st.error("⚠️ High risk of lung cancer. Please consult a doctor.")
    else:
        st.success("✅ No Lung Cancer.")
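
The app above reports only the hard 0/1 prediction. A possible variation is to show the estimated probability as well. The sketch below assumes the loaded model exposes a scikit-learn style `predict_proba` (XGBoost's scikit-learn wrapper does) and that class 1 occupies column index 1. The function name `risk_message`, the 0.5 default threshold, and the `StubModel` class are hypothetical and exist only for illustration.

```python
def risk_message(model, rows, threshold=0.5):
    """Describe the positive-class probability for the first row in `rows`.

    `rows` is whatever the model accepts, e.g. the one-row `data` DataFrame
    built in the app. Hypothetical helper, not part of the published app.
    """
    # predict_proba yields one [P(class 0), P(class 1)] pair per input row.
    proba = float(model.predict_proba(rows)[0][1])
    label = "High risk" if proba >= threshold else "Low risk"
    return f"{label} (estimated probability {proba:.0%})"


class StubModel:
    """Stand-in that lets the function be exercised without model.pkl."""

    def predict_proba(self, rows):
        return [[0.2, 0.8] for _ in rows]
```

In the app, `st.write(risk_message(model, data))` inside the `Predict` branch would display the message. The threshold is a design choice: lowering it flags more patients for follow-up at the cost of more false alarms.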