Jorge Guerra

Hello, I'm Jorge Guerra

I've spent the last 10+ years building ML systems, data platforms, and products that solve real problems — from saving banks millions with predictive models to deploying cardiovascular risk tools used by 500K+ people. My work spans machine learning, NLP, full-stack development, and mobile apps across banking, healthcare, research, and tech startups. I'm equally comfortable building frontend interfaces, designing backend architectures, writing Python pipelines, or automating complex workflows end-to-end. What I enjoy most is collaborating with cross-functional teams to turn complex data challenges into products that actually deliver. I'm also available for consulting — if you have a project in mind, let's talk.

Experience

10+ years across banking, healthcare, research, technology, and transportation industries.

Co-Founder & CTO

Nov 2025 – Present

Onkopilot

San Juan, PR

  • Co-founded an AI-powered oncology support platform for cancer patients and caregivers
  • Built the marketing website and am currently developing the web application, which uses a Retrieval-Augmented Generation (RAG) system for personalized oncology guidance
  • Leading technical architecture leveraging LLMs and medical knowledge bases for accurate, empathetic patient support
Python, Next.js, RAG, LLMs, LangChain, Firebase, NLP

Co-Founder & CTO

Apr 2025 – Present

Xsporty

San Juan, PR

  • Co-founded a sports platform that helps athletes find games and build community
  • Developed the full product stack: web app, iOS app, Android app, and all backend services
  • Architected and built real-time game matching, user networking, and scheduling systems from the ground up
Python, Flutter, Dart, Firebase, Cloud Functions, iOS, Android, Web

Data Scientist Program Lead

Jun 2023 – Present

Popular Bank

San Juan, PR

  • Co-designed, implemented, and managed a training program built on Microsoft Power Platform and Python to develop workforce expertise in process automation, data analytics, and data science
  • Led Data Analytics and Data Science Programs, overseeing end-to-end analytics workflows — defining success criteria, generating insights, and delivering actionable recommendations
  • Managed performance metrics, aligned cross-functional stakeholders, and developed interactive dashboards and reports for data-driven decision-making
  • Designed scalable analytics solutions, built robust data models, and implemented machine learning models to optimize business operations and drive efficiency
  • Fostered a data-driven culture, standardizing methodologies to assess project value, process improvements, and strategic investments across teams
  • Unclaimed Property: Saved $58M by developing a tool that assesses customer behavior and classifies claimed vs. unclaimed property data so that government clients receive accurate information
  • Loan Portfolio Valuation: Saved $300K annually by implementing advanced data processing and optimization techniques for portfolio asset valuation, forecasting, and amortization
  • Customer Lifetime Value Prediction: Developed ML models that predict short- and long-term customer value across diverse segments to optimize marketing campaigns
  • ATH Móvil Alert Automation: Built an ML model that classifies AML alerts as SAR or No SAR, reducing analyst workload by 10X
Python, LightGBM, XGBoost, SQL, Azure, Databricks, Power Platform, Machine Learning

Co-Founder

Nov 2022 – May 2023

MYH Solutions LLC

Dallas, TX

  • Designed, developed, and managed the universal order system for Virtual Care Coordinators (VCC) implemented in outpatient neurology clinics
  • Leveraged automation and implemented data entry improvements to analyze, report, and enhance the clinics' operational digital processes
  • Streamlined workflows and increased efficiency, empowering Virtual Care Coordinators to provide more effective and timely services to patients
Python, Applied Machine Learning, Project Management, Process Automation

CTO

2021 – 2025

My Dental Path

San Juan, PR

  • Led all technical development for a platform empowering internationally trained dentists to navigate US dental school admissions
  • Built and maintained the website including service pages, bundled package flows, and content management
  • Managed the full technology stack from front-end design to hosting and deployment
Web Development, UI/UX Design, SEO, Responsive Design, Firebase

Data Scientist

Mar 2021 – May 2022

One Drop

Remote

  • Researched and developed a cardiovascular disease risk model that augments a well-known categorical risk score with continuous measurements (blood glucose, blood pressure, BMI, weight variability, and activity calories) to estimate the CV risk score and its variability over time for people with diabetes
  • Ranked diabetes-related measurements to quantify their importance and their association with different physiological events
  • Designed and developed an insights system to provide useful, timely, and accurate information about different day patterns to improve communication and services between coaches and customers
Python, Constrained Optimization, Statistical Modeling, AWS, Data Analysis

Senior Data Scientist

Sep 2019 – Feb 2021

KPMG US

Dallas-Fort Worth, TX

  • Developed a process automation tool that uses computer vision and NLP to retrieve, analyze, and provide real-time insight into medical record audits, automating tasks while improving the speed, accuracy, and consistency of record management
  • Developed and implemented a computer vision model that detects and processes asset tags to improve warehouse management workflows
Python, C#, NLP, OCR, Computer Vision, Azure, SQL Server

Data Scientist

Aug 2017 – Aug 2019

Children's Hospital of Philadelphia

Philadelphia, PA

  • Technical lead of a group (two staff members and four students) that designed, developed, and deployed a recommendation system analyzing CHOP and UPenn researchers' publications and staff skill sets to recommend meaningful collaborations; currently in production
  • Technical lead of a clinical research group which designed and developed an algorithm to predict the likelihood of patients missing their appointments — won The Drexel LeBow Analytics 50 Award
  • Developed and deployed automation scripts to support ETL of data from IBM Netezza CHOP Data Warehouse prior to machine learning model analysis
  • Served as liaison between the Advanced Analytics Department and hospital staff requesting data analysis, data modeling, and information visualization
Python, TensorFlow, NLP, SVM, Neural Networks, R, SQL

Graduate Researcher — Computational Biology

Jan 2017 – May 2017

Columbia University

New York, NY

  • Conducted research in human genetics, developing a model to provide patient information based on variants from exome sequencing data using read depth information
  • Investigated the value of imputation from exome sequencing to capture lower frequency variants in isolated populations to identify mutations that may increase the risk of common diseases
  • Developed quality control tools to preprocess data and run several burden-test methods to assess association with specific traits
Python, Bioinformatics, Statistical Genetics, Data Analysis

Graduate Researcher — Robotics Lab

Sep 2015 – May 2017

Columbia University

New York, NY

  • Performed research in learning and classification of IMU sensor data using logistic regression, Hidden Markov Models and other ML techniques
  • Developed a system to accurately predict and quantify the type and accuracy of upper body primitive motions of stroke patients
  • Published: Capture, Learning, and Classification of Upper Extremity Movement Primitives, IEEE ICORR 2017, London
Python, MATLAB, HMM, Logistic Regression, Signal Processing

Software Validation Engineer — HPC Group

Jun 2016 – Aug 2016

Intel Corporation

King of Prussia, PA

  • Developed, executed, and debugged compute fabric validation plans and tests for Intel HPC products
  • Employed lean/agile practices for product software and system release quality
  • Developed programs to provide reliable and consistent test infrastructure for network systems and components
Software Validation, HPC, Agile, Testing

Software Developer — NAND Group

Jan 2015 – Aug 2015

Intel Corporation

Folsom, CA

  • Designed, developed, and validated testability circuit software for Non-Volatile Memory
  • Developed and debugged complex software programs to convert design validation vectors and drive test equipment
  • Tested, validated, modified, and re-designed circuit test programs to guarantee component margin to specification with emphasis on yield analysis
Software Development, NAND Memory, Validation, Test Engineering

Undergraduate Research Assistant

Nov 2012 – Dec 2014

UCF Intelligent Systems Lab

Orlando, FL

  • Conducted research in context-based learning and collaborative context-based reasoning applied to the development of a multi-agent collaboration system
Multi-Agent Systems, Machine Learning, Research

Summer Undergraduate Researcher

Jun 2014 – Aug 2014

UC Berkeley

Berkeley, CA

  • Conducted research in probabilistic inference and sequential decision making under uncertainty in robotics applications
Probabilistic Inference, Robotics, Decision Making

Probe Product Engineer — NAND Group

May 2013 – Aug 2013

Intel Corporation

Folsom, CA

  • Developed and debugged wafer-level parametric, functional, and characterization tests, focusing on automation and analysis tools for NAND memory technology products
  • Gathered and analyzed data from wafer sort production runs and engineering experiments to guarantee component specifications and performance
  • Taught a Perl training class for probe engineers on methods for developing probability plots and automatically detecting run-to-run yield toggles
Perl, Data Analysis, NAND Memory, Test Engineering

Product Development Engineer — Processor Group

Jan 2012 – Aug 2012

Intel Corporation

Folsom, CA

  • Performed process validation and applied general test methodologies during the Ivy Bridge project
  • Built, debugged, compiled, and tested development tools in Visual Studio C# .NET and Perl
  • Developed code to automate manual and functional test cases
  • Integrated, debugged, validated, and released module/test programs for HVM customers
C#, .NET, Perl, ATE Platforms, Linux, Windows

Knowledge Based Engineering Systems Intern

May 2011 – Aug 2011

Boeing

Bellevue, WA

  • Implemented automation on the CAD/CAE/KBE tool CATIA V5
  • Provided support for knowledge-based engineering products using the .NET framework
.NET, CATIA V5, CAD/CAE, Automation

Projects

Featured machine learning and data science projects

entrepreneurship · Co-Founder & CTO

Onkopilot

Onkopilot · 2025 – Present

AI-powered oncology support platform providing personalized guidance for cancer patients and caregivers.

AI/ML, NLP, LLMs, Web Development (+2 more)

entrepreneurship · Co-Founder & CTO

Xsporty

Xsporty · 2025 – Present

Sports platform connecting athletes to find games and build community, enabling seamless sports participation.

Web Development, Mobile Development, Real-time Systems, Cloud Infrastructure (+1 more)

entrepreneurship · CTO

My Dental Path

My Dental Path · 2021 – 2025

Web platform empowering internationally trained dentists to navigate US dental school applications with expert guidance.

Web Development, UI/UX Design, SEO, Responsive Design

entrepreneurship · CEO & Founder

Activo

Activo · 2023 – Present

Mobile app for discovering local events and experiences in Puerto Rico, built with Flutter.

Flutter, Dart, Firebase, Cloud Functions (+2 more)

banking

Loan Portfolio Valuation Model

Popular Bank · 2023 – Present

ML model replacing third-party vendor for loan portfolio fair value estimation, saving $300K/year.

Python, LightGBM, SQL, Azure Databricks (+2 more)

banking

ATH Móvil AML Alert Detection

Popular Bank · 2023 – Present

ML-based anti-money laundering system reducing analyst workload by 10X.

Python, XGBoost, LightGBM, SQL (+2 more)

banking

Customer Lifetime Value Prediction

Popular Bank · 2023 – 2024

ML model predicting customer lifetime value to optimize marketing and retention strategies.

Python, Scikit-learn, SQL, Azure Databricks (+1 more)

healthcare

Virtual Care Coordinator NLP Pipeline

MYHealth · 2022 – 2023

NLP system using BioBERT and ClinicalBERT to automate virtual care coordinator referral processing.

Python, spaCy, BioBERT, ClinicalBERT (+3 more)

healthcare

Cardiovascular Disease Prevention Risk Model

One Drop · 2021 – 2022

Risk prediction model deployed in mobile app serving 500K+ users for chronic condition management.

Python, Constrained Optimization, Statistical Modeling, AWS (+1 more)

healthcare

CMS RADV Medical Record Automation

KPMG · 2019 – 2021

NLP and OCR pipeline automating 10,000+ manual medical record review tasks for CMS audits.

Python, C#, NLP, OCR (+3 more)

healthcare

Patient No-Show Prediction

Children's Hospital of Philadelphia · 2017 – 2018

Neural network model predicting patient appointment no-shows, winning the Analytics 50 Award.

Python, TensorFlow, Neural Networks, Logistic Regression (+2 more)

research

CLPsych 2019 Suicide Risk Assessment

Children's Hospital of Philadelphia · 2018 – 2019

NLP ensemble model for suicide risk detection from social media text, published at ACL CLPsych 2019.

Python, SVM, Naive Bayes, NLP (+3 more)

research

SCOSY Clinical Decision Support

Children's Hospital of Philadelphia · 2017 – 2018

Recommendation system using LDA and collaborative filtering for clinical decision support, published at IEEE EMBS.

Python, LDA, Collaborative Filtering, NLP (+2 more)

research

Upper Extremity Movement Classification

Columbia University · 2015 – 2016

HMM and logistic regression models for rehabilitation robotics movement classification, published at ICORR 2017.

Python, MATLAB, HMM, Logistic Regression (+2 more)

Skills

Technologies and tools I work with

Languages

Python, SQL, R, C#, MATLAB, Java

Machine Learning & AI

Scikit-learn, XGBoost, LightGBM, Random Forest, SVM, Logistic Regression, Ensemble Methods, Feature Engineering

Deep Learning

TensorFlow, PyTorch, Neural Networks, HMM, Constrained Optimization

NLP

spaCy, NLTK, BioBERT, ClinicalBERT, Transformers, Text Classification, NER, Sentiment Analysis, Topic Modeling

Cloud & Infrastructure

AWS, Azure, Databricks, Docker, Git, CI/CD

Databases

PostgreSQL, SQL Server, MongoDB, Snowflake, BigQuery

Visualization & BI

Tableau, Power BI, Matplotlib, Seaborn, Plotly

Generative AI

LLMs, Prompt Engineering, RAG, LangChain, OpenAI API, Claude, Copilot Studio, Fine-tuning

Publications

Peer-reviewed papers and research contributions

Cardiovascular Risk Prediction for Mobile Health Applications

Jorge Guerra, et al.

ScienceDirect — Intelligence-Based Medicine · 2025

Cardiovascular Disease Risk Variability Over Time in People With Diabetes

Jorge Guerra, et al.

Circulation (AHA Scientific Sessions 2021), Vol. 144, Suppl. 1 · 2021

Continuous Cardiovascular Risk Estimation for People With Diabetes

Jorge Guerra, et al.

Circulation (AHA Scientific Sessions 2019), Vol. 140, Suppl. 1 · 2019

CLPsych2019 Shared Task: Predicting Suicide Risk Level from Reddit Posts on Multiple Forums

Victor Ruiz, Lingyun Shi, Wei Quan, Neal Ryan, Candice Biernesser, David Brent, Rich Tsui

ACL Workshop on Computational Linguistics and Clinical Psychology (CLPsych) · 2019

Prediction of One-Year Transplant-Free Survival after Norwood Procedure Based on the Pre-Operative Data

M. Luis Ahumada, Jacquelin Peck, Jorge Guerra, Nhue Do, Monesha Gupta, Sharon Ghazarian, Mohamed Rehman, P. Jeffrey Jacobs, Ali Jalali

40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) · 2018

SCOSY: A Biomedical Collaboration Recommendation System

Jorge Guerra, Wei Quan, Ao Li, Luis Ahumada, Flayton Winston, Ravi Desai

40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) · 2018

Capture, Learning, and Classification of Upper Extremity Movement Primitives in Healthy Controls and Stroke Patients

Jorge Guerra, Jasim Uddin, Dawn Nilsen, James McInerney, Ammarah Fadoo, Isirame B. Omofuma, Shatif Hughes, Sunil Agrawal, Peter Allen, Heidi M. Schambra

International Conference on Rehabilitation Robotics (ICORR) · 2017

Awards & Honors

Recognition for excellence in data science and research

Analytics 50 Award

Analytics Magazine · 2018

Recognized for innovative patient no-show prediction model at Children's Hospital of Philadelphia, selected among the top 50 analytics projects nationwide.

GEM Fellowship

National GEM Consortium · 2014

Awarded the prestigious GEM Fellowship for graduate studies in engineering and science, supporting MS studies at Columbia University.

McNair Scholar

Ronald E. McNair Post-Baccalaureate Achievement Program · 2013

Selected as a McNair Scholar, a program preparing underrepresented students for doctoral studies through research and mentorship.

Education

Academic background

Master of Science in Data Science

2015 – 2017

Columbia University

New York, NY

  • GEM Fellow
  • Focus on machine learning and statistical modeling

Bachelor of Science in Computer Engineering

2009 – 2014

University of Central Florida

Orlando, FL

  • McNair Scholar
  • Dean's List

Get in Touch

Interested in collaborating or learning more about my work? Feel free to reach out.

San Juan, Puerto Rico