Authors - S. B. Hema Anjali, Manikanta Sai Sumeeth, Sushama Rani Dutta

Abstract - This study presents a machine learning system that predicts health insurance costs, a relevant problem given the growing need for such estimates in a post-COVID-19 world. Using the Medical Cost Personal Dataset available on Kaggle, which contains 1,338 records, we applied several machine learning models: the ensemble methods XGBoost, Gradient Boosting Machine (GBM), and Random Forest, along with Support Vector Machines (SVM). In our results, XGBoost produced the most accurate estimates, but it was costly to implement. Random Forest was less demanding to apply and remained highly effective. We also discuss how a big data approach based on Spark can improve performance when working with large datasets. Overall, this work positions XGBoost as the best choice for health insurance cost prediction and argues that there is scope for improving decision making in healthcare by deploying ML methods.
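The kind of model comparison summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes scikit-learn is available and uses synthetic data, because the Kaggle file is not bundled here. The feature names (age, bmi, children, smoker), the coefficients used to generate charges, and the helper names make_synthetic_data and compare_models are all hypothetical. scikit-learn's GradientBoostingRegressor stands in for GBM, and XGBoost is left out to avoid an extra dependency.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_data(n_rows=1338, seed=0):
    """Build a synthetic stand-in for an insurance cost table (assumed schema)."""
    rng = np.random.default_rng(seed)
    age = rng.integers(18, 65, n_rows)
    bmi = rng.normal(30.0, 6.0, n_rows)
    children = rng.integers(0, 6, n_rows)
    smoker = rng.integers(0, 2, n_rows)
    # Hypothetical cost formula, chosen only so the models have a signal to learn.
    charges = (
        250.0 * age
        + 300.0 * bmi
        + 500.0 * children
        + 20000.0 * smoker
        + rng.normal(0.0, 2000.0, n_rows)
    )
    X = np.column_stack([age, bmi, children, smoker]).astype(float)
    return X, charges


def compare_models(X, y, seed=0):
    """Fit each regressor on a train split and report R^2 and RMSE on a test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "Gradient Boosting": GradientBoostingRegressor(random_state=seed),
        "Random Forest": RandomForestRegressor(n_estimators=200, random_state=seed),
        # SVR is sensitive to scale, so both features and target are standardized.
        "SVM": TransformedTargetRegressor(
            regressor=make_pipeline(StandardScaler(), SVR(kernel="rbf")),
            transformer=StandardScaler(),
        ),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "r2": float(r2_score(y_test, pred)),
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        }
    return results


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, scores in compare_models(X, y).items():
        print(f"{name}: R^2={scores['r2']:.3f}, RMSE={scores['rmse']:.1f}")
```

Scores from this sketch reflect only the synthetic data and say nothing about the ranking reported in the study. To reproduce the comparison on the real dataset, the synthetic generator would be replaced by loading the Kaggle file and encoding its categorical columns.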