MISM 258

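# MISM 258 standard ETL snippet
import pandas as pd
from geopy.distance import geodesic

def calculate_distance(row):
    # Build (latitude, longitude) tuples for the shipment's origin and destination.
    origin = (row['origin_lat'], row['origin_lon'])
    dest = (row['dest_lat'], row['dest_lon'])
    # Geodesic distance between the two points, in kilometers.
    return geodesic(origin, dest).km

# df is an existing DataFrame holding the four coordinate columns used above.
df['distance_km'] = df.apply(calculate_distance, axis=1)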
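The distance feature above depends on geopy. Where that library is unavailable, a rough stand-in is the haversine great-circle formula, sketched below using only the standard library. This is a swapped-in technique rather than the snippet's own method: it treats the Earth as a sphere, so results differ slightly from geopy's ellipsoidal geodesic. The names `haversine_km` and `EARTH_RADIUS_KM` are illustrative assumptions.

```python
import math

# Mean Earth radius in km; assumes a spherical Earth (an approximation).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

With a DataFrame shaped like the one in the snippet, this could be applied row by row in the same way, for example `df.apply(lambda r: haversine_km(r['origin_lat'], r['origin_lon'], r['dest_lat'], r['dest_lon']), axis=1)`.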

Subject: Final Project Analysis: Optimizing Operational Efficiency through Predictive Modeling
Date: [Current Date]
Prepared for: Professor [Name], Heinz College
Prepared by: [Your Name/Team Name]

1. Executive Summary

This report synthesizes the core methodologies and outcomes of the MISM 258 (Data Analytics & Business Intelligence) capstone project. Using a real-world dataset from a mid-sized e-commerce logistics firm, we applied predictive modeling (Logistic Regression & Random Forest) to forecast shipment delays. The key finding indicates that implementing the proposed Random Forest model can reduce misclassification costs by 22% compared to the company’s current heuristic model. Recommendations include integrating real-time weather data into the feature set and retraining the model bi-weekly.

2. Introduction

2.1. Course Context

MISM 258 focuses on the end-to-end process of business intelligence: from data warehousing and ETL (Extract, Transform, Load) to advanced analytics and dashboard visualization. The core tenet is transforming raw data into actionable strategic assets.
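The Executive Summary compares models by misclassification cost. The report does not state how that cost is computed, so the following is only a minimal sketch of one common formulation: a weighted count of false positives and false negatives. The function names, the label convention (1 = delayed, 0 = on time), the cost weights, and the toy labels are all hypothetical and are not taken from the project data.

```python
def misclassification_cost(y_true, y_pred, fp_cost, fn_cost):
    """Total cost of errors: false positives and false negatives weighted separately."""
    # False positive: shipment predicted delayed (1) but actually on time (0).
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # False negative: shipment predicted on time (0) but actually delayed (1).
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return fp * fp_cost + fn * fn_cost


def relative_reduction(baseline_cost, model_cost):
    """Fractional cost reduction of a model relative to a baseline."""
    return (baseline_cost - model_cost) / baseline_cost


# Toy labels for illustration only (hypothetical, not project results).
y_true = [1, 0, 1, 1, 0, 0]
heuristic_pred = [0, 0, 1, 0, 1, 0]
model_pred = [1, 0, 1, 0, 0, 0]

# Hypothetical weights: a missed delay is assumed to cost 5x a false alarm.
baseline = misclassification_cost(y_true, heuristic_pred, fp_cost=1.0, fn_cost=5.0)
candidate = misclassification_cost(y_true, model_pred, fp_cost=1.0, fn_cost=5.0)
print(f"relative reduction: {relative_reduction(baseline, candidate):.1%}")
```

Weighting the two error types separately matters when a missed delay is more expensive than a false alarm, in which case plain accuracy can rank models differently than cost does.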