About Candidate

Location

Education

Ph.D. 2008
Boston University
M.S. 2001
Nanjing University
B.S. 1998
Lanzhou University

Work & Experience

Principal Data Scientist (VP) 05/2023 - 12/2025
Citizens Bank

• Led the development of a new loan pricing optimization (LPO) system for the Education Refinance Loan (ERL) portfolio, replacing the legacy FICO LPO platform and eliminating $500K in annual subscription cost. Partnered closely with the pricing team and finance stakeholders to ensure the solution aligned with business strategy and operational needs.
• This LPO implements a highly complex optimization engine that combines a pricing elasticity model, a volume forecasting model, and a loan Profit & Loss model, and integrates regulatory, risk, profitability, and competitive constraints into a unified framework. The system optimizes high-dimensional, interdependent parameters to generate stable, scenario-specific pricing solutions, supports diverse business strategies, and improves the return-volume tradeoff while running efficiently on desktop-level compute (~10 minutes). Its modular architecture allows easy extension to additional loan products.
• Developed end-to-end ML solutions (from data extraction and feature engineering to model training, validation, deployment, and drift monitoring) for loan-pricing elasticity, credit-card delinquency, and ERL retention using Logistic Regression and XGBoost, improving marketing strategies and strengthening risk management.

Manager, Data Science 06/2021 - 02/2023
Publicis Sapient AI Labs

• Built CatBoost models to infer demographic attributes from noisy, unstructured inputs (handwriting, scanned images, PDFs), directly supporting a U.S. government agency's consumer product safety report data quality initiative, and collaborated closely with policy, compliance, and technical stakeholders to align model outputs with operational needs.
• Developed CatBoost time-series models for customer cash-flow forecasting for a major U.S. bank, partnering with product, risk, and finance teams to translate modeling insights into actionable improvements for customer financial planning and risk management.
• Built Market Mix Models to evaluate and optimize customer incentive programs for a national retailer, and communicated findings to marketing and executive teams, enabling data-driven pricing decisions and improved promotional effectiveness.

Sr. Data Scientist 06/2019 - 03/2021
Fidelity Investments

• Built core models for an automated equity execution system under noisy market conditions. Because each trade provides only one realized execution path while the optimizer must evaluate 1,000+ counterfactual paths, designed a two-model framework: a rich pre-trade model and a generalizable post-trade model to estimate costs across all hypothetical strategies. The models were integrated into a cloud-based optimizer that delivered ~$30M/year savings (~1 bps) over the legacy system.
• Ensured model performance under evolving market conditions by monitoring drift and recalibrating models, working closely with data engineering and business teams to deliver reliable insights that support customer-focused decision-making.

Sr. Data Scientist (Scrum Master) 04/2018 - 06/2019
NetBrain Tech. Inc

• Led a team of five data scientists and collaborated closely with cross-functional stakeholders to design, build, validate, and deploy full-stack ML models for network anomaly detection. Delivered significant improvements in detection accuracy using LSTM, CNN, ARIMA, Holt-Winters, PCA, and statistical modeling, while mentoring junior staff and establishing best practices for model monitoring and drift management.
• Implemented word embeddings with CNN and RNN models to analyze raw network text logs, enhancing anomaly detection and increasing troubleshooting efficiency.
• Constructed correlation-analysis frameworks that facilitated faster root cause identification and improved network troubleshooting capabilities.
• Developed a multi-modal, cross-device anomaly detection system using Isolation Forest and statistical methods, which increased AUC by ~20% and broadened detection coverage in a simulated network environment.

Sr. Data Scientist 06/2016 - 03/2018
Fidelity National Information Services (FIS)

• Developed and delivered full-stack fraud detection systems for ACH payments using logistic regression and AdaBoost, which enhanced fraud detection accuracy and reduced false positives.
• Applied advanced signal processing techniques (including Kalman filtering, wavelet transforms, Fourier spectral analysis, ARIMA modeling, and time-series decomposition) to extract robust fraud-detection signals from noisy customer data, enabling higher-precision ML classification with Platt-scaled probabilities.
• Invented “influence diagrams” to visualize real AdaBoost model behavior, diagnose overfitting dimensions, and guide training improvements, leading to a ~100% increase in fraud dollars detected. Work accepted for presentation at ODSC East 2018.

Skills

machine learning
99%
deep learning
98%
AI, LLM
85%
data science, data analysis, data engineering
99%
Python, SQL
99%
AWS, Google Cloud, Azure
95%
time-series forecasting, fraud detection, anomaly detection
99%