Minghai Li
About Candidate
Location
Education
Work & Experience
Led the development of a new loan pricing optimization (LPO) system for the Education Refinance Loan (ERL) portfolio, replacing the legacy FICO LPO platform and eliminating $500K in annual subscription cost. Partnered closely with the pricing team and finance stakeholders to ensure the solution aligned with business strategy and operational needs.
The LPO system implements a complex optimization engine that combines a pricing elasticity model, a volume forecasting model, and a loan Profit & Loss model, and integrates regulatory, risk, profitability, and competitive constraints into a unified framework. The system optimizes high-dimensional, interdependent parameters to generate stable, scenario-specific pricing solutions, supports diverse business strategies, and improves the return–volume tradeoff while running efficiently on desktop-level compute (~10 minutes). Its modular architecture allows easy extension to additional loan products.
Developed end-to-end ML solutions—from data extraction and feature engineering to model training, validation, deployment, and drift monitoring—for loan-pricing elasticity, credit-card delinquency, and ERL retention (Logistic Regression, XGBoost), improving marketing strategies and strengthening risk management.
Built CatBoost models to infer demographic attributes from noisy, unstructured inputs—handwriting, scanned images, PDFs—directly supporting a US government agency’s consumer product safety report data quality initiative and collaborating closely with policy, compliance, and technical stakeholders to align model outputs with operational needs.
Developed CatBoost time-series models for customer cash-flow forecasting for a major U.S. bank, partnering with product, risk, and finance teams to translate modeling insights into actionable improvements for customer financial planning and risk management.
Built Market Mix Models to evaluate and optimize customer incentive programs for a national retailer and communicated findings to marketing and executive teams, enabling data-driven pricing decisions and improved promotional effectiveness.
Built core models for an automated equity execution system under noisy market conditions. Because each trade provides only one realized execution path while the optimizer must evaluate 1,000+ counterfactual paths, designed a two-model framework: a rich pre-trade model and a generalizable post-trade model to estimate costs across all hypothetical strategies. The models were integrated into a cloud-based optimizer that delivered ~$30M/year savings (~1 bps) over the legacy system.
Ensured model performance under evolving market conditions by monitoring drift and recalibrating models, working closely with data engineering and business teams to deliver reliable insights that support customer-focused decision-making.
Led a team of five data scientists and collaborated closely with cross-functional stakeholders to design, build, validate, and deploy full-stack ML models for network anomaly detection. Delivered significant improvements in detection accuracy using LSTM, CNN, ARIMA, Holt-Winters, PCA, and statistical modeling, while mentoring junior staff and establishing best practices for model monitoring and drift management.
Implemented word embeddings with CNN and RNN models to analyze raw network text logs, enhancing anomaly detection and increasing troubleshooting efficiency.
Constructed correlation-analysis frameworks that facilitated faster root cause identification and improved network troubleshooting capabilities.
Developed a multi-modal, cross-device anomaly detection system using Isolation Forest and statistical methods, which increased AUC by ~20% and broadened detection coverage in a simulated network environment.
Developed and delivered full-stack fraud detection systems for ACH payments using logistic regression and AdaBoost, which enhanced fraud detection accuracy and reduced false positives.
Applied advanced signal processing techniques—including Kalman filtering, wavelet transforms, Fourier spectral analysis, ARIMA modeling, and time-series decomposition—to extract robust fraud-detection signals from noisy customer data, enabling higher-precision ML classification with Platt-scaled probabilities.
Invented “influence diagrams” to visualize real AdaBoost model behavior, diagnose overfitting dimensions, and guide training improvements, leading to a ~100% increase in fraud dollars detected. Work accepted for presentation at ODSC East 2018.