Tourism Package Prediction Model

Model Description

This model predicts whether a customer will purchase a Wellness Tourism Package from the "Visit with Us" travel company. It uses an XGBoost classifier with a custom preprocessing pipeline that handles both numeric and categorical features.

Intended Use

Primary Use: Identify potential customers for the Wellness Tourism Package to optimize marketing outreach and improve conversion rates.

Users: Sales and marketing teams at travel companies.

Out-of-scope: This model should not be used for discriminatory purposes or decisions that could significantly impact individuals' lives beyond marketing preferences.

Training Data

Dataset: Tourism package purchase history
Features: 18 features including customer demographics, travel preferences, and sales interaction data
- 12 numeric features (Age, CityTier, MonthlyIncome, etc.)
- 6 categorical features (Gender, Occupation, Designation, etc.)
Target: Binary classification (ProdTaken: 0 = No purchase, 1 = Purchase)
Training Set: 3302 samples
Test Set: 826 samples
Class Imbalance: Handled using scale_pos_weight parameter
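
The card does not state the value used for scale_pos_weight. A common heuristic suggested in the XGBoost documentation is the ratio of negative to positive examples in the training set. The sketch below computes that ratio from a list of binary labels. The helper name and the example labels are hypothetical and do not reflect the real class distribution.

```python
from collections import Counter


def compute_scale_pos_weight(labels):
    """Return n_negative / n_positive for binary 0/1 labels."""
    counts = Counter(labels)
    n_pos, n_neg = counts.get(1, 0), counts.get(0, 0)
    if n_pos == 0:
        raise ValueError("no positive examples; the ratio is undefined")
    return n_neg / n_pos


# Hypothetical labels: 8 non-purchases and 2 purchases (illustration only).
example_labels = [0] * 8 + [1] * 2
weight = compute_scale_pos_weight(example_labels)
print(weight)  # 4.0
```

The resulting value would typically be passed as the scale_pos_weight argument when constructing the classifier.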

Model Architecture

Algorithm: XGBoost Classifier with sklearn preprocessing pipeline

Preprocessing:

Numeric features: Passthrough (no transformation)
Nominal categorical features: OrdinalEncoder
Ordinal feature (Designation): OrdinalEncoder with hierarchy (Executive → Manager → Senior Manager → AVP → VP)
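
To make the Designation hierarchy concrete, here is a minimal pure-Python sketch of the mapping an ordinal encoder produces when given an explicit category order. It is an illustration only: the published pipeline uses sklearn's OrdinalEncoder, and how that pipeline treats unseen values is not stated in this card, so raising an error here is an assumption.

```python
# Category order taken from the hierarchy described above.
DESIGNATION_ORDER = ["Executive", "Manager", "Senior Manager", "AVP", "VP"]
DESIGNATION_RANK = {name: rank for rank, name in enumerate(DESIGNATION_ORDER)}


def encode_designation(value):
    """Map a Designation string to its rank (0 = lowest, 4 = highest)."""
    try:
        return DESIGNATION_RANK[value]
    except KeyError:
        # Assumption: unknown categories are rejected rather than imputed.
        raise ValueError(f"unknown Designation: {value!r}") from None


encoded = [encode_designation(v) for v in ["Manager", "VP", "Executive"]]
print(encoded)  # [1, 4, 0]
```

Encoding the hierarchy explicitly preserves the seniority order, so tree splits such as "Designation <= 1" separate junior from senior roles.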

Best Hyperparameters:

colsample_bylevel: 0.6
colsample_bytree: 0.6
learning_rate: 0.15
max_depth: 5
n_estimators: 250
reg_lambda: 0.5

Classification Threshold: 0.45 (optimized for F1-score)
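
The card does not describe how the 0.45 threshold was found. One common approach is to sweep candidate thresholds over predicted probabilities and keep the one with the highest F1, ideally on a validation split rather than the test set. The sketch below shows that idea on toy data; the function names, labels, and probabilities are all hypothetical.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """Compute F1 for the positive class at a given probability threshold."""
    tp = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, y_prob, candidates):
    """Return the first candidate threshold that achieves the maximum F1."""
    return max(candidates, key=lambda t: f1_at_threshold(y_true, y_prob, t))


# Toy data for illustration only (not from the real dataset).
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.48, 0.46, 0.80, 0.30, 0.55]
candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95

best = best_f1_threshold(y_true, y_prob, candidates)
print(best, f1_at_threshold(y_true, y_prob, best))
```

When several thresholds tie on F1, this sketch returns the lowest one; a real selection procedure might break ties differently, for example by preferring higher precision.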

Performance Metrics

Training Set

Accuracy: 0.9942
Precision: 0.9711
Recall: 1.0000
F1-Score: 0.9853

Test Set

Accuracy: 0.9286
Precision: 0.8165
Recall: 0.8113
F1-Score: 0.8139
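
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 values can be recomputed from the other two metrics. The snippet below does this for the numbers above; the helper name is hypothetical.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the Performance Metrics section of this card.
reported = {
    "train": {"precision": 0.9711, "recall": 1.0000, "f1": 0.9853},
    "test": {"precision": 0.8165, "recall": 0.8113, "f1": 0.8139},
}

recomputed = {
    split: f1_from_precision_recall(m["precision"], m["recall"])
    for split, m in reported.items()
}
for split, value in recomputed.items():
    print(f"{split}: reported F1={reported[split]['f1']:.4f}, recomputed F1={value:.4f}")
```

Both recomputed values agree with the reported ones to four decimal places.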

How to Use

import joblib
import pandas as pd
from huggingface_hub import hf_hub_download

# Download the model
model_path = hf_hub_download(
    repo_id="nsriram78/tourism-package-prediction",
    filename="tourism_conversion_predict_model.joblib",
    repo_type="model"
)

# Load the model
model = joblib.load(model_path)

# Prepare input data (must match training feature order)
input_data = pd.DataFrame([{
    'Age': 35,
    'CityTier': 1,
    'DurationOfPitch': 15,
    'NumberOfPersonVisiting': 3,
    'NumberOfFollowups': 3,
    'PreferredPropertyStar': 4.0,
    'NumberOfTrips': 3,
    'Passport': 1,
    'PitchSatisfactionScore': 3,
    'OwnCar': 1,
    'NumberOfChildrenVisiting': 1,
    'MonthlyIncome': 22000,
    'TypeofContact': 'Self Enquiry',
    'Occupation': 'Salaried',
    'Gender': 'Male',
    'ProductPitched': 'Basic',
    'MaritalStatus': 'Married',
    'Designation': 'Manager'
}])

# Get prediction probability
prediction_proba = model.predict_proba(input_data)[0, 1]

# Apply custom threshold
prediction = int(prediction_proba >= 0.45)

print(f"Purchase Probability: {prediction_proba:.2%}")
print(f"Prediction: {'Will Purchase' if prediction == 1 else 'Will Not Purchase'}")

Training Procedure

Data Preparation: 80/20 train-test split with stratification
Hyperparameter Tuning: GridSearchCV with 5-fold cross-validation
Optimization Metric: F1-Score (to balance precision and recall)
Experiment Tracking: MLflow for logging parameters and metrics
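
To illustrate what stratification does in the 80/20 split, here is a minimal pure-Python sketch that splits indices per class so that both partitions keep the same class ratio. The actual training code is not shown in this card and most likely relies on a library routine; the function name, seed, and toy labels here are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Split indices into train/test while preserving per-class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels: 80 negatives and 20 positives (illustration only).
labels = [0] * 80 + [1] * 20
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 80 20
```

The sample counts reported in Training Data are consistent with this ratio: 826 of 4128 total samples is about 20%.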

Limitations and Considerations

The model is trained on historical data and may not generalize to significantly different customer populations
Performance depends on data quality and feature completeness
Class imbalance is handled via scale_pos_weight, but it may still affect predictions on the minority class
Custom threshold of 0.45 is optimized for the current dataset; it may need adjustment for different use cases
The model assumes input features are in the same order and format as the training data
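
Because the model is sensitive to feature order, it can help to validate and reorder each record before building the input DataFrame. The sketch below is one way to do that. It assumes the column order shown in the How to Use example matches the training order; the helper name is hypothetical.

```python
# Assumption: this order, copied from the How to Use example, matches training.
EXPECTED_COLUMNS = [
    "Age", "CityTier", "DurationOfPitch", "NumberOfPersonVisiting",
    "NumberOfFollowups", "PreferredPropertyStar", "NumberOfTrips", "Passport",
    "PitchSatisfactionScore", "OwnCar", "NumberOfChildrenVisiting",
    "MonthlyIncome", "TypeofContact", "Occupation", "Gender",
    "ProductPitched", "MaritalStatus", "Designation",
]


def order_record(record):
    """Return a dict with exactly the expected columns, in the expected order."""
    missing = [c for c in EXPECTED_COLUMNS if c not in record]
    extra = [c for c in record if c not in EXPECTED_COLUMNS]
    if missing or extra:
        raise ValueError(
            f"missing columns: {missing}; unexpected columns: {extra}"
        )
    return {c: record[c] for c in EXPECTED_COLUMNS}


# A record supplied in the wrong order is reordered before scoring.
shuffled = {c: 0 for c in reversed(EXPECTED_COLUMNS)}
ordered = order_record(shuffled)
print(list(ordered)[:3])  # ['Age', 'CityTier', 'DurationOfPitch']
```

The returned dict can be wrapped in a single-row DataFrame, as in the How to Use example, so the columns reach the pipeline in a known order.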

Ethical Considerations

Ensure the model is used responsibly and for marketing purposes only
Regularly monitor for bias in predictions across different demographic groups
Respect customer privacy and comply with data protection regulations
Provide opt-out mechanisms for customers who don't wish to be contacted

Model Card Authors

Sriram Narasimhan

Model Card Contact

For questions or issues, please open an issue in the model repository.
