About This Engineer

Massa Coulibaly

Full-stack & forward-deployed engineer. I build software end-to-end and embed with the people who use it — shipping in places where most tools don't reach.

Recent UC Berkeley MIMS graduate. Ex Co-founder & CTO of Afya — USSD-based EMR for Sub-Saharan Africa. Forward-deployed engineer who's architected, shipped, and handed off production systems on-site.

School: UC Berkeley MIMS
Prev: CTO @ Afya
Focus: Health IT · FDE
Seeking: FDE · Full-Stack

Experience

Master's Projects

MIMS '26 — capstones, course projects, and applied research

OfferBloom — AI Interview Prep

private · live

Full-stack AI interview-prep platform (MIMS Capstone). React + Vite frontend, FastAPI backend, Neo4j graph model, Cloudflare R2 storage, Claude Haiku integration for answer drafting, verbal-practice feedback, and resume coaching.

stack

React · Vite · FastAPI · Neo4j · Cloudflare R2 · Claude Haiku · Python · Railway

highlights

  • Designed Neo4j graph linking users, questions, answers, practices, and files
  • Seeded 320 role-specific interview questions across 8 roles and 8 categories
  • Integrated Claude Haiku for AI-assisted answer drafting and verbal-practice feedback
  • Built resume coaching flow with structured feedback on bullet quality

Clinical Trial Navigator (CTN)

Knowledge-graph platform matching sickle-cell patients to eligible clinical trials. Berkeley INFO 290 / DEVENG 204 capstone — addressing the <5% enrollment rate across 250+ active trials.

stack

Python · Django 4.2 · RDF/Turtle · SPARQL · SHACL · rdflib · Anthropic API · Railway

highlights

  • Designed RDF/Turtle data model + RDFS schema with inference; SHACL validation via pySHACL
  • Built SPARQL eligibility-matching engine over 250 ingested trials
  • Django web app with patient registry, trial browser, and live matching demo
  • Pitched $150K seed with TAM/SAM/SOM model targeting expansion to 5 countries

The Sound of Power — UN Speeches NLP

ANLP final project. NLP pipeline analyzing rhetorical strategies across 2,069 UN Security Council transcripts using LDA, BERTopic, and transformer-based LLMs.

stack

Python · BERTopic · LDA · Transformers · HuggingFace · Anthropic API · spaCy · Gensim

highlights

  • Designed a multi-prompt LLM workflow (HuggingFace + Anthropic API) for thematic comparison across blocs
  • Tested what drives linguistic variability in diplomatic speeches
  • Found resolution type outweighs country identity and bloc membership
  • Cleaned and normalized noisy OCR-extracted text from scanned UN documents

Joke Generator

Front-end Web Architecture class project. React app fetching random jokes from an external API in a responsive interface.

stack

React · CSS · Joke API

highlights

  • Integrated external Joke API with live fetching on button click
  • Built with modular, reusable React components
  • Fully responsive layout that adapts across device sizes

Web Dev & Systems

Ghana MMDA Budget Transparency

private · live

Architected and built an end-to-end ETL pipeline ingesting 1,400+ municipal budget PDFs (2014–2026) across 261 Metropolitan, Municipal & District Assemblies (MMDAs), forming Ghana's first centralized fiscal transparency platform. Solo build on a $51K budget.

stack

Python · pdfplumber · camelot · PostgreSQL · React · AWS (RDS/S3/EC2) · Azure Form Recognizer · GPT APIs · Vercel

highlights

  • Format-aware parsing engine routing PDFs across 5 document generations (Old Composite Budget, PBB v1–v4)
  • Normalized PostgreSQL schema aligned to GFS/COFOG codebooks with currency + inflation adjustments
  • Validation + approval workflow (Draft → Submitted → Review → Approved → Published) with field-level audit trail
  • REST API + React dashboards for Budget Explorer, comparative analytics, dev-vs-recurrent splits
  • Azure Form Recognizer + GPT for OCR/extraction on low-quality scans
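
The validation workflow and field-level audit trail listed above could be modeled roughly as follows. This is a minimal sketch, not the project's actual code: the class names (`BudgetRecord`, `AuditEntry`), the `total_revenue` field, and the user names are hypothetical, and only the five status names come from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Forward-only transitions, using the status names from the highlight above.
TRANSITIONS = {
    "Draft": "Submitted",
    "Submitted": "Review",
    "Review": "Approved",
    "Approved": "Published",
}


@dataclass
class AuditEntry:
    """One recorded change to a single field (hypothetical structure)."""
    field_name: str
    old_value: object
    new_value: object
    changed_by: str
    changed_at: datetime


@dataclass
class BudgetRecord:
    """A budget row with a status and an append-only audit log (hypothetical)."""
    values: dict
    status: str = "Draft"
    audit_log: list = field(default_factory=list)

    def _log(self, name, old, new, user):
        self.audit_log.append(
            AuditEntry(name, old, new, user, datetime.now(timezone.utc))
        )

    def set_field(self, name, new_value, user):
        """Update one field and record who changed it and what it was before."""
        old_value = self.values.get(name)
        if old_value == new_value:
            return  # nothing changed, so nothing to audit
        self._log(name, old_value, new_value, user)
        self.values[name] = new_value

    def advance(self, user):
        """Move to the next status; a Published record cannot advance."""
        if self.status not in TRANSITIONS:
            raise ValueError(f"No transition out of status {self.status!r}")
        next_status = TRANSITIONS[self.status]
        self._log("status", self.status, next_status, user)
        self.status = next_status
        return next_status


# Example with made-up values: correct a figure, then submit the record.
record = BudgetRecord(values={"total_revenue": 1000})
record.set_field("total_revenue", 1200, user="analyst")
record.advance(user="analyst")
```

Keeping status changes in the same log as field edits gives a single timeline per record; a real deployment would presumably persist these entries in PostgreSQL rather than in memory.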

Afya Health Information Systems

private

Co-founder & CTO. USSD-based EHR platform enabling clinics in low-resource settings to securely capture patient data offline and sync with national health systems when connectivity returns. Big Ideas Berkeley finalist.

stack

USSD · Python · PostgreSQL · Offline-first · Health IT

highlights

  • Selected as Big Ideas Berkeley finalist for transforming record-keeping in Sub-Saharan Africa
  • Designed offline-first sync architecture for intermittent connectivity
  • USSD interface targets clinics without smartphones or stable internet

mibegnon-project

Full-stack scholarship platform with authentication, user dashboard, and automated scholarship scraping — built with Next.js, TypeScript, and Prisma.

stack

Next.js · TypeScript · Prisma · Python · Jupyter

highlights

  • Built role-based auth with protected dashboard and public-facing routes
  • Automated scholarship discovery via a Python scraper feeding into a Prisma-backed database
  • Designed schema to track scholarship deadlines, eligibility, and user applications

HIMS — Friends Eye Center

Built a Healthcare Information Management System for a clinic in Ghana on top of OpenEMR. Includes a data migration pipeline using Python and Pandas to move legacy Excel records into the new system.

stack

Python · Pandas · SQL · OpenEMR · Ghana 🇬🇭

highlights

  • Wrote SQL migration scripts via Pandas to transfer clinic data from Excel sheets into the new OpenEMR-based system
  • Clinic proposal and initial analysis included in the repo — shared with permission from Friends Eye Center
  • Replaced entirely paper-based workflows across three sites: patient records, billing, and appointments
  • System still in use 2+ years later; trained KNUST and Technical University of Tamale students to maintain it post-handoff

Friends Eye Center — Public Website

Public-facing website for the FEC clinic in Kumasi, Ghana. Marketing, services, and contact for the eye center I previously digitized.

stack

Next.js · React · TypeScript · Tailwind · Vercel

highlights

  • Designed and shipped a responsive marketing site for the same clinic I built the HIMS for
  • Deployed on Vercel with continuous delivery
  • Companion piece to the OpenEMR-based HIMS — full lifecycle from clinical software to public web presence

Data Science & AI

Billboard Hit Prediction Tool

Built and evaluated four ML models on 3,000+ songs using Spotify, Billboard and YouTube APIs, achieving over 80% accuracy predicting chart success.

stack

Python · Scikit-learn · Spotify API · Billboard API · YouTube API · Pandas

highlights

  • Compared Logistic Regression, Random Forest, XGBoost, and SVM — XGBoost topped 80% accuracy
  • Handled severe class imbalance between charting and non-charting songs via SMOTE
  • Engineered audio features (danceability, valence, tempo) into model-ready inputs across 3 APIs

ETF Stock Analysis

Constructed time series models to demonstrate the impact of US Unemployment Rate announcements on QQQ, IWM, and SPY ETF prices.

stack

Python · statsmodels · ARIMA · GARCH · Pandas · Matplotlib

highlights

  • Modeled volatility clustering around macro announcements using GARCH
  • Ran event-study analysis isolating pre/post announcement price reactions
  • Aligned non-synchronous financial time series from multiple data sources
massa@portfolio:~$ whoami
> Massa Coulibaly, Full-Stack & Forward-Deployed Engineer
> Ex CTO @ Afya · UC Berkeley MIMS '26
> Contact: massa@berkeley.edu