Work Experience
Qure.ai
Title History
| Title | Level | Period | Duration |
| --- | --- | --- | --- |
| Senior AI Scientist | Level 3 | Apr 2024 - Present | 1 year 8 months |
| AI Scientist | Level 2 | Jul 2022 - Mar 2024 | 1 year 9 months |
| AI Scientist | Internship | Jul 2021 - Jun 2022 | 1 year |
Projects
1) Data & Annotations
Summary: Own the data acquisition, curation, and annotation programs for two CT-first, multimodal products, safeguarding 30+ TB of vendor, research, and client data (arriving via S3 buckets, other cloud shares, and literal hard-drive shipments) and transforming every raw submission into a standardized, analysis-ready corpus. Built an end-to-end operating model, from ingestion and metadata modeling to annotation orchestration and QA, that keeps R&D unblocked, gives product teams instant visibility into data readiness, and sustains high annotator satisfaction even as volume has grown severalfold.
Scope & Data Governance: 30+ TB, multi-modality, compliance-ready
- Manage several terabytes (30+ TB) of training, validation, and testing data spanning CT, CTA, perfusion, X-ray, PET-CT, biopsy, and hybrid patient datapoints sourced from vendors, open datasets, research collaborators, and both prospective and live clients—regardless of whether the drop arrives through S3/GCS/Azure shares, secure FTP, or encrypted hard drives.
- Ingest unstructured deliveries, normalize them into a universal folder/schema layout, and capture every metadata field in strongly-typed BSON documents with canonical naming so downstream teams can query any cohort without spelunking raw disks.
- Run automated integrity checks (modality compliance, DICOM completeness, scan contract specs, corruption and duplication detection) before a study is accepted, and log all results to a Postgres-driven ingestion tracker that ultimately publishes authoritative references into MongoDB.
- Classify each DICOM series into modality / protocol buckets (non-contrast head CT, CTA, perfusion, etc.), tag their viable problem statements, and persist those tags in MongoDB for instant cohort filtering.
- Cache frequently accessed scans as memory-efficient `safetensors` blobs, shrinking read latency by ~80% while staying within tight on-prem / cloud storage budgets.
- Ensure every ingestion, transformation, and storage workflow adheres to the company's multi-framework regulatory obligations for medical data handling, with audit-ready trails baked into the process definitions.
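The series-bucketing step above can be sketched roughly as follows; the DICOM field names are standard, but the protocol rules and bucket names are illustrative stand-ins for the production taxonomy:

```python
# Classify a DICOM series into a modality/protocol bucket from header
# metadata. Thresholds and bucket names are illustrative, not the
# production rules, which key off the full DICOM header.

def classify_series(meta: dict) -> str:
    """Return a protocol bucket such as 'ncct_head' or 'cta_head'."""
    modality = meta.get("Modality", "").upper()
    desc = meta.get("SeriesDescription", "").lower()
    contrast = meta.get("ContrastBolusAgent") not in (None, "", "NONE")

    if modality == "CT":
        if "perfusion" in desc:
            return "ct_perfusion"
        if contrast or "angio" in desc or "cta" in desc:
            return "cta_head"
        return "ncct_head"
    if modality == "MR":
        return "mri_head"
    return "unclassified"
```

Tags produced this way are what get persisted to MongoDB for instant cohort filtering.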
Annotation Operations: Taxonomy, tooling, validation
- Define per-problem taxonomies, sampling rules, and prioritization logic so every annotation batch directly advances a product hypothesis or research deliverable.
- Author upload/download automation for the RedBrick AI portal, including scripts that package imaging and metadata, push batches, and pull completed labels with version tracking.
- Build validation daemons that inspect coverage immediately after each datapoint is annotated, flagging incompleteness or schema drift in near real time and driving re-annotation rates to ~0%.
- Process downloaded annotations into lightweight, queryable stores with fast filtering (e.g., by modality, labeler, abnormality) so model training pipelines can materialize cohorts without manual wrangling.
- Develop NLP + LLM parsing utilities, powered by a stack of proprietary models, GPT, Gemini, and self-hosted Qwen instances, that read radiology reports, extract hierarchical findings and attributes, and align them with structured tags to boost weak supervision and cohort triage.
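The per-datapoint validation check is conceptually simple; a minimal sketch, assuming a hypothetical label schema keyed by protocol bucket:

```python
# Validate one completed annotation against its expected label schema,
# as the post-annotation daemons do. The schema below is a hypothetical
# example; production schemas are defined per problem statement.

REQUIRED_LABELS = {
    "ncct_head": {"hemorrhage", "midline_shift", "fracture"},
}

def validate_annotation(bucket: str, annotation: dict) -> list[str]:
    """Return a list of issues; an empty list means the datapoint passes."""
    issues = []
    expected = REQUIRED_LABELS.get(bucket, set())
    present = set(annotation.get("labels", {}))
    missing = expected - present
    if missing:
        issues.append(f"missing labels: {sorted(missing)}")
    extra = present - expected
    if extra:
        issues.append(f"unexpected labels (schema drift?): {sorted(extra)}")
    return issues
```

Running this immediately after each download is what keeps re-annotation loops near zero: problems surface while the annotator still has context.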
Quality & Issue Resolution: Concordance-first workflows
- Partnered with product POCs (who manage annotator contracts) to codify escalation paths so R&D only intervenes for nuanced clinical clarifications while still getting rapid answers.
- Tackled ambiguous problem statements by rolling out secondary review passes, consensus templates, and label-specific heuristics that maximize usable signal despite inherent reader variability.
- Maintained structured QC logs that correlate annotator performance, modality difficulty, and downstream model impact, ensuring noisy labels are filtered or reweighted before training.
- Increased annotator satisfaction (surveyed by the product team) by giving them clearer instructions, tighter taxonomies, and faster tooling.
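The QC-log aggregation behind label filtering can be sketched like this (record fields are illustrative, not the production schema):

```python
from collections import defaultdict

# Aggregate structured QC records into per-annotator quality stats, the
# basis for deciding which labels get filtered or reweighted before
# training. The 'qc_flagged' field is an illustrative stand-in.

def qc_summary(records: list[dict]) -> dict[str, float]:
    """Map annotator id -> fraction of their datapoints flagged by QC."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        total[rec["annotator"]] += 1
        flagged[rec["annotator"]] += bool(rec.get("qc_flagged"))
    return {a: flagged[a] / total[a] for a in total}
```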
Team Leadership & Collaboration: Scaling via playbooks
- After bootstrapping the system, recruited, trained, and now supervise three data engineers who run day-to-day ingestion and annotation ops, each capable of adapting the framework to new client quirks or modalities.
- Provide KT packets, SOPs, and shadowing sessions so engineers can troubleshoot vendor datasets independently while still escalating blocking edge cases to me.
- Coordinate with the broader product, BI, and engineering orgs for capacity planning, security reviews, and audit readiness.
Impact: Throughput, latency, satisfaction
- Achieved a raw 18× increase in annotation throughput; even after normalizing for 3× higher label demand and a 2× larger annotator pool, per-annotator productivity still improved ~3×.
- Validation scripts cut re-annotation loops to nearly 0%, freeing annotators to focus on net-new data instead of fixes.
- Safetensor caching reduced data access time by ~80%, enabling near-immediate fulfillment of ad-hoc analysis and training-data requests.
- Maintained annotation quality at previous baselines despite much higher volume, while overall turnaround for data/annotation analysis requests dropped sharply thanks to the trained three-person team (qualitatively observed by stakeholders).
2) Production Codebase
Summary: Architected and own a production‑grade AI pipeline framework for head CT/CTA/MRI triage that reduced turnaround time by 57%, increased automated test coverage from 22% → 91%, cut new‑model integration time from ~8 days → ~1 day, and dropped configuration errors from ~700/year → 0, with processing errors falling from ~500/year to two incidents in two years.
Ownership & Scope: Hybrid deployments, resource-aware, 2k scans/mo
- Lead and own the end‑to‑end AI production codebase for the neuro‑imaging triage product, covering head CT, CTA, and MRI across on‑prem, cloud, and hybrid deployments.
- Designed the system to operate reliably under CPU/GPU constraints (e.g., production on AWS `g4dn.4xlarge` instances with limited GPU and RAM), adding failsafes so models continue to run gracefully even under resource pressure.
- Support ~2,000 valid series per month through this pipeline while maintaining strict reliability and performance guarantees.
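The graceful-degradation pattern looks roughly like the sketch below; in the real pipeline the exception caught would be `torch.cuda.OutOfMemoryError`, and the fallback is wired into each pipeline node rather than a bare decorator:

```python
import functools

# Sketch of the resource-pressure failsafe: try the step on GPU first,
# and on an out-of-memory error retry on CPU instead of failing the
# whole study. MemoryError stands in for torch.cuda.OutOfMemoryError.

def with_cpu_fallback(fn):
    """Run fn with device='cuda' first; on OOM, retry with device='cpu'."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, device="cuda", **kwargs)
        except MemoryError:
            return fn(*args, device="cpu", **kwargs)
    return wrapper
```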
Architecture & Implementation: Modular TorchScript graph pipelines
- Designed and implemented the entire Python codebase from scratch, building a modular, fault‑tolerant pipeline framework that decouples TorchScript‑serialized models per CT/CTA study, each with its own preprocessing, postprocessing, and multi‑level collation stages.
- Introduced a custom process‑graph / conditional‑subgraph abstraction with node‑level logging, benchmarking, and multiprocessing, allowing independent steps to run in parallel, with clear failure tracebacks and automatic merging of longest‑common‑prefix graphs to deduplicate shared processing.
- Ensured backward compatibility so existing models and legacy pipelines continue to run unchanged, while new models can be plugged into the same modular framework without disrupting production.
- Implemented output caching and database‑backed persistence for intermediate and final results, keeping outputs serializable and memory‑efficient to respect RAM limits on large CT scans.
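The longest-common-prefix merge is the key deduplication idea: when two model pipelines share the same leading steps (e.g., identical preprocessing), the shared prefix runs once and its output fans out to each branch. A toy version, with illustrative step names:

```python
# Toy sketch of longest-common-prefix pipeline merging. Pipelines are
# modeled as ordered step-name lists; the real framework operates on a
# process graph with node-level logging and multiprocessing.

def common_prefix(a: list[str], b: list[str]) -> list[str]:
    """Longest common prefix of two step sequences."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix

def merge_two(p1: list[str], p2: list[str]) -> dict:
    """Split two pipelines into a shared prefix plus per-model suffixes."""
    shared = common_prefix(p1, p2)
    return {"shared": shared,
            "branches": [p1[len(shared):], p2[len(shared):]]}
```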
Configuration Management & Safety: Nested Pydantic configs, backward compatible
- Architected a nested Pydantic‑based client‑configuration schema that strictly validates model selection, thresholds, routing, and reporting rules, making it effectively impossible to persist invalid configurations in the database.
- Migrated all existing client configs into the new structure, validating them and enforcing a standardized, modular config format where all clients share a common base schema plus optional, explicit customizations.
- Passed model outputs at every collation stage through Pydantic model classes as an additional safety and consistency check, catching the rare type or shape mismatch the main code path would otherwise let slip.
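A minimal stand-in for the nested client-config schema, assuming hypothetical field names: every client shares a common base model, and invalid values fail at parse time rather than reaching the database.

```python
from pydantic import BaseModel, ValidationError

# Illustrative nested config schema. Field names are hypothetical; the
# production schema also covers routing and reporting rules.

class ModelConfig(BaseModel):
    name: str
    threshold: float
    enabled: bool = True

class ClientConfig(BaseModel):
    client_id: str
    models: list[ModelConfig]

cfg = ClientConfig(
    client_id="site_001",
    models=[{"name": "ich_detector", "threshold": 0.5}],
)

# A malformed config never persists: Pydantic rejects it at construction.
try:
    ClientConfig(client_id="site_002",
                 models=[{"name": "ich_detector", "threshold": "high"}])
    rejected = False
except ValidationError:
    rejected = True
```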
Quality, Testing, and Tooling: 2k+ tests, enforced pre-commit + CI
- Built >2,000 unit and integration tests using pytest (with parameterization), including end‑to‑end tests for every model pipeline on real data and collation‑level tests on real outputs, raising coverage from 22% to 91%.
- Enforced a high‑quality development workflow with Black, isort, Flake8, and pyupgrade, all wired into pre‑commit hooks so every contributor follows the same style and linting rules.
- Integrated with GitHub Actions (for CODEOWNERS and review ownership of the R&D code) and Jenkins (for build, test, and deployment pipelines managed by engineering), and worked with QA/engineering teams who run additional regression and golden‑case tests before productizing new models.
- Documented the codebase with Google‑style docstrings, type hints, and graph visualizations of the processing pipelines to make behavior transparent for both engineers and non‑R&D stakeholders.
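The pre-commit wiring might look like the sketch below (the `rev` pins are placeholders; a real repo pins them to audited versions):

```yaml
# Illustrative .pre-commit-config.yaml for the linters named above.
repos:
  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/isort
    rev: 5.13.2
    hooks:
      - id: isort
  - repo: https://github.com/PyCQA/flake8
    rev: 7.0.0
    hooks:
      - id: flake8
  - repo: https://github.com/asottile/pyupgrade
    rev: v3.15.0
    hooks:
      - id: pyupgrade
```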
Reliability, Monitoring, and Impact: Resource benchmarking, Slack alerts, 2 incidents in 2 years
- Performed extensive benchmarking under multiprocessing to ensure that total CPU, GPU, and RAM usage stays within safe limits; added failsafes and graceful‑degradation paths for resource‑related failures.
- Enabled Slack‑based alerting (implemented with the engineering and business‑integrations teams) for errors, weekly volume metrics, and other key product KPIs.
- Reduced end‑to‑end CT/CTA turnaround time by 57%, cut new‑model integration time from ~8 days to ~1 day, and achieved zero configuration‑related failures with only two processing‑pipeline issues in two years, both resolved in ≤2 days (vs. ~14 days previously).
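The Slack hook is a standard incoming-webhook post; a sketch, with an illustrative message format (the real integration, owned jointly with engineering, adds retries and batching):

```python
import json
from urllib import request

# Sketch of the Slack alerting path: errors and weekly volume metrics
# are posted to an incoming webhook. Message format is illustrative.

def build_alert(kind: str, detail: str) -> dict:
    """Build the webhook payload for one alert."""
    return {"text": f":rotating_light: [{kind}] {detail}"}

def send_alert(webhook_url: str, kind: str, detail: str) -> None:
    """POST the alert payload to a Slack incoming webhook."""
    payload = json.dumps(build_alert(kind, detail)).encode("utf-8")
    req = request.Request(webhook_url, data=payload,
                          headers={"Content-Type": "application/json"})
    request.urlopen(req)  # fire-and-forget; production adds retries
```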
Collaboration & Leadership: Sole R&D owner → mentor, cross-org standards
- Served as the sole R&D owner for the production pipeline, then transitioned into a mentorship and review role after knowledge transfer, as additional engineers began contributing to the codebase.
- Standardized pre‑commit and testing practices for all technical contributors working on this product, driving a company‑wide uplift in code quality for this area.
- Collaborated closely with the product’s engineering lead, core engineering team, and business‑integrations team to align interfaces, deployment strategy, monitoring, and operational processes.