🫁 Datathon Lung Cancer Detector
This model predicts whether a patient is likely to have lung cancer based on clinical and behavioral risk factors.
It was trained on a dataset of 309 entries with 15 input features and a binary diagnosis label.
Input Features
Feature | Type | Description |
---|---|---|
GENDER | 0 = Female, 1 = Male | Biological sex |
AGE | Integer | Patient age |
SMOKING | 0/1 | Smoking habit |
YELLOW_FINGERS | 0/1 | Stained fingers from smoking |
ANXIETY | 0/1 | Anxiety symptoms |
PEER_PRESSURE | 0/1 | Influence from peers |
CHRONIC DISEASE | 0/1 | History of chronic illness |
FATIGUE | 0/1 | Feeling of tiredness |
ALLERGY | 0/1 | Known allergies |
WHEEZING | 0/1 | Wheezing symptoms |
ALCOHOL CONSUMING | 0/1 | Alcohol consumption |
COUGHING | 0/1 | Persistent coughing |
SHORTNESS OF BREATH | 0/1 | Difficulty breathing |
SWALLOWING DIFFICULTY | 0/1 | Trouble swallowing |
CHEST PAIN | 0/1 | Pain in chest area |
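
A minimal sketch of how a record could be checked against the feature table before it reaches the model. The `FEATURES` constant and the `validate_record` helper are hypothetical names introduced for illustration; the column names and allowed values come from the table, while the age bounds (0 to 120) are an assumption.

```python
# Column names in the order listed in the feature table.
FEATURES = [
    "GENDER", "AGE", "SMOKING", "YELLOW_FINGERS", "ANXIETY",
    "PEER_PRESSURE", "CHRONIC DISEASE", "FATIGUE", "ALLERGY",
    "WHEEZING", "ALCOHOL CONSUMING", "COUGHING",
    "SHORTNESS OF BREATH", "SWALLOWING DIFFICULTY", "CHEST PAIN",
]


def validate_record(record):
    """Return a list of problems found in `record` (empty if it is valid)."""
    problems = []
    for name in FEATURES:
        if name not in record:
            problems.append(f"missing feature: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"{name} must be an integer, got {value!r}")
        elif name == "AGE":
            if not 0 <= value <= 120:  # assumed plausible range
                problems.append(f"AGE out of range: {value}")
        elif value not in (0, 1):
            problems.append(f"{name} must be 0 or 1, got {value}")
    # Flag keys that are not part of the feature table.
    unexpected = sorted(set(record) - set(FEATURES))
    problems.extend(f"unexpected feature: {name}" for name in unexpected)
    return problems


# Illustrative record, not taken from the dataset.
example = {name: 0 for name in FEATURES}
example.update({"GENDER": 1, "AGE": 63, "SMOKING": 1, "COUGHING": 1})
print(validate_record(example))  # prints []
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid field at once.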
Model Info
- Algorithm: XGBoost classifier (highest score)
- Framework: Scikit-learn
- Target: `DIAGNOSIS_LUNG_CANCER` (`YES` = Lung Cancer, `NO` = No Cancer)
- Dataset Size: 309 samples
- Preprocessing: Label encoding, binary encoding for yes/no inputs
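
The preprocessing step can be sketched as follows. This is an illustration under assumptions, not the training code: the helper names `fit_label_encoder` and `encode` are hypothetical, the sample values are made up, and the raw gender codes `M`/`F` are assumed. Assigning integers in sorted label order is what scikit-learn's `LabelEncoder` does, and it yields `NO` = 0 and `YES` = 1, which agrees with the `prediction == 1` check in the Streamlit app.

```python
def fit_label_encoder(values):
    """Map each distinct label to an integer, in sorted label order."""
    return {label: index for index, label in enumerate(sorted(set(values)))}


def encode(values, mapping):
    """Replace every label with its integer code."""
    return [mapping[value] for value in values]


# Made-up sample of the target column, for illustration only.
diagnosis = ["YES", "NO", "YES", "YES"]
mapping = fit_label_encoder(diagnosis)
print(mapping)                     # prints {'NO': 0, 'YES': 1}
print(encode(diagnosis, mapping))  # prints [1, 0, 1, 1]

# Assumed raw gender codes; sorted order gives F = 0 and M = 1,
# consistent with "0 = Female, 1 = Male" in the feature table.
print(fit_label_encoder(["M", "F", "M"]))  # prints {'F': 0, 'M': 1}
```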
Try It in Streamlit
This model is also available as a web app built with Streamlit: https://datathonlungcancer.streamlit.app/
```python
import streamlit as st
import pandas as pd
import joblib

# Load the trained classifier saved alongside the app.
model = joblib.load('model.pkl')

st.title('🫁 Lung Cancer Diagnosis')
st.write("Please fill out the following information to assess the likelihood of lung cancer.")

# GENDER is already numeric; format_func only changes the displayed label.
gender = st.selectbox('Gender', [0, 1], format_func=lambda x: "Female" if x == 0 else "Male")
# min_value keeps negative ages out of the model input.
age = st.number_input('Age', min_value=0, max_value=120, value=0)
smoking = st.selectbox('Smoking', ['Yes', 'No'])
yellow_fingers = st.selectbox('Yellow Fingers', ['Yes', 'No'])
anxiety = st.selectbox('Anxiety', ['Yes', 'No'])
peer_pressure = st.selectbox('Peer Pressure', ['Yes', 'No'])
chronic_disease = st.selectbox('Chronic Disease', ['Yes', 'No'])
fatigue = st.selectbox('Fatigue', ['Yes', 'No'])
allergy = st.selectbox('Allergy', ['Yes', 'No'])
wheezing = st.selectbox('Wheezing', ['Yes', 'No'])
alcohol = st.selectbox('Alcohol Consuming', ['Yes', 'No'])
coughing = st.selectbox('Coughing', ['Yes', 'No'])
shortness_of_breath = st.selectbox('Shortness of Breath', ['Yes', 'No'])
swallowing_difficulty = st.selectbox('Swallowing Difficulty', ['Yes', 'No'])
chest_pain = st.selectbox('Chest Pain', ['Yes', 'No'])


def binary_encode(value):
    """Convert a 'Yes'/'No' answer to 1/0."""
    return 1 if value == 'Yes' else 0


# Build a single-row DataFrame with the same column names used in training.
data = pd.DataFrame([[gender, age,
                      binary_encode(smoking),
                      binary_encode(yellow_fingers),
                      binary_encode(anxiety),
                      binary_encode(peer_pressure),
                      binary_encode(chronic_disease),
                      binary_encode(fatigue),
                      binary_encode(allergy),
                      binary_encode(wheezing),
                      binary_encode(alcohol),
                      binary_encode(coughing),
                      binary_encode(shortness_of_breath),
                      binary_encode(swallowing_difficulty),
                      binary_encode(chest_pain)]],
                    columns=['GENDER', 'AGE', 'SMOKING', 'YELLOW_FINGERS', 'ANXIETY',
                             'PEER_PRESSURE', 'CHRONIC DISEASE', 'FATIGUE', 'ALLERGY',
                             'WHEEZING', 'ALCOHOL CONSUMING', 'COUGHING',
                             'SHORTNESS OF BREATH', 'SWALLOWING DIFFICULTY', 'CHEST PAIN'])

if st.button('Predict'):
    # predict returns one label per row; take the label for the single row.
    prediction = model.predict(data)[0]
    if prediction == 1:
        st.error("⚠️ High risk of lung cancer. Please consult a doctor.")
    else:
        st.success("✅ No Lung Cancer.")
```
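
The prediction branch of the app can be exercised without Streamlit or `model.pkl` by moving it into a plain function. The sketch below is one possible arrangement, not part of the deployed app: `result_message` and `StubModel` are hypothetical names, and the stub's rule (flag any row whose third value, `SMOKING`, is 1) is arbitrary and stands in for the trained classifier only so the example runs on its own.

```python
HIGH_RISK = "High risk of lung cancer. Please consult a doctor."
LOW_RISK = "No Lung Cancer."


def result_message(model, data):
    """Return the message the app would show for the first row of `data`."""
    prediction = model.predict(data)[0]
    return HIGH_RISK if prediction == 1 else LOW_RISK


class StubModel:
    """Hypothetical stand-in for the trained model (arbitrary rule)."""

    def predict(self, rows):
        # Index 2 is SMOKING in the column order used by the app.
        return [1 if row[2] == 1 else 0 for row in rows]


# Rows follow the app's column order: GENDER, AGE, SMOKING, then 12 more flags.
smoker = [1, 63, 1] + [0] * 12
non_smoker = [0, 40, 0] + [0] * 12
print(result_message(StubModel(), [smoker]))      # prints the high-risk message
print(result_message(StubModel(), [non_smoker]))  # prints "No Lung Cancer."
```

In the app, the same function could be called with the loaded model and the single-row DataFrame, with its return value passed to `st.error` or `st.success`.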