Hadoop
Tommy + the technology of Hadoop
Hadoop Examples

MapR Technologies
Big Data / Hadoop Distributor, 2015 - 2018
A portal bringing together version control, automated test definitions and statuses, quality metrics, Jira tickets, CI/CD jobs, and supporting infrastructure definitions and statuses in a single place to aid release management and DevOps practices.
Behind the scenes, pipelines built with K8s (Kubernetes), Mesos, GitHub, and Jenkins automatically provisioned environments, deployed our software, and ran extensive tests on it, including complex multi-cloud platform scale tests across Google Cloud and AWS as well as on-prem with bare metal and OpenStack.
Observability was introduced to MapR via the Spyglass Project, which sought to obtain workload metrics as well as application-specific metrics across all the tools and infrastructure of the MapR Hadoop stack.
My responsibilities included automating the build and deployment of the full Hadoop stack under development, authoring and executing automated tests, mentoring junior teammates to do the same, collaborating with dev teams to ensure they plugged into our CI/CD and test framework cleanly, and troubleshooting problems that arose.
Key Results
- Unified 5 disparate DevOps tools into a single portal, reducing context switching by 80%
- Implemented comprehensive observability across the Hadoop stack, monitoring 100+ metrics
- Automated multi-cloud testing across GCP, AWS, and on-prem, reducing test setup time by 80%

Explorys
IBM Watson Health, 2012 - 2014
New Admin Dashboard - Architected and implemented a greenfield admin dashboard, working with junior developers and leadership to deliver it in the requested stack: Bootstrap / jQuery / Ruby / Java / MySQL / LDAP. Users relied on the Admin Dashboard to administer organizations, roles, and other management aspects of the Explorys EPM Suite.
Dependency Injection Framework - To keep components highly testable, every component required its dependencies to be passed in, so it was desirable to eliminate factory boilerplate without adopting a complex DI system. Wrote a simple DI framework based on runtime reflection and shared it with multiple projects and teams.
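To illustrate the general idea of reflection-based constructor injection, here is a minimal sketch in Python. It is not the original framework: its language and API are not described above, so every name here (Container, bind, resolve, Clock, FixedClock, AuditLog) is hypothetical, and the sketch assumes each class declares its dependencies as type-annotated constructor parameters.

```python
from typing import Any, Dict, get_type_hints


class Container:
    """Tiny constructor-injection container (illustrative only)."""

    def __init__(self) -> None:
        # Maps an interface (or base class) to the concrete class to build.
        self._bindings: Dict[type, type] = {}

    def bind(self, interface: type, implementation: type) -> None:
        """Register which concrete class satisfies an interface."""
        self._bindings[interface] = implementation

    def resolve(self, cls: type) -> Any:
        """Build cls, recursively resolving its annotated constructor parameters."""
        target = self._bindings.get(cls, cls)
        # Reflect on the constructor at runtime to discover its dependencies.
        hints = get_type_hints(target.__init__)
        hints.pop("return", None)
        dependencies = {name: self.resolve(dep) for name, dep in hints.items()}
        return target(**dependencies)


# Hypothetical components used only to demonstrate the container.
class Clock:
    def now(self) -> str:
        raise NotImplementedError


class FixedClock(Clock):
    def __init__(self) -> None:
        self.value = "2014-01-01"

    def now(self) -> str:
        return self.value


class AuditLog:
    def __init__(self, clock: Clock) -> None:
        self.clock = clock

    def record(self, event: str) -> str:
        return f"{self.clock.now()} {event}"
```

With a binding such as `container.bind(Clock, FixedClock)`, calling `container.resolve(AuditLog)` builds the whole graph without any hand-written factory, and a test can swap in a different Clock by changing one binding.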
Temporal Sequencing - Architected, implemented, and tested an enhancement to the proprietary Explorys MDL (measure definition language) that lets measure authors express dynamic temporal conditions as part of the predicate calculus that makes up the measure defined in a given MDL instance.
The solution runs in Hadoop as a mapper: the MDL is parsed, a corresponding object graph is constructed, and control is passed to the encapsulating object of that graph. The result is a determination of how closely the given patient adheres to the measure defined by the MDL, which is then emitted for further processing.
The challenge here was authoring the solution without knowing the history or complexity of the existing system. I overcame it by applying SOLID and TDD principles: expressing assumptions about the existing external system as interfaces, getting the solution working in a test sandbox as a standalone application, and only then writing adapters between the actual system and the interfaces that had been mocked out for testing. The team that owned the system seemed impressed by and curious about the solution and the new patterns and tests involved.
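The interface-first approach described above can be sketched roughly as follows. This is an illustration in Python, not the original code: the interface PatientEvents, the rule AdherenceRule, the in-memory test double, and the legacy row shape with "pt" and "evt_cd" keys are all hypothetical stand-ins.

```python
from typing import Dict, List, Protocol


class PatientEvents(Protocol):
    """An assumption about the host system, written down as an interface."""

    def events_for(self, patient_id: str) -> List[str]:
        ...


class AdherenceRule:
    """New logic depends only on the interface, never on the host system."""

    def __init__(self, source: PatientEvents, required_event: str) -> None:
        self._source = source
        self._required_event = required_event

    def is_adherent(self, patient_id: str) -> bool:
        return self._required_event in self._source.events_for(patient_id)


class InMemoryEvents:
    """Test double used while developing in the standalone sandbox."""

    def __init__(self, data: Dict[str, List[str]]) -> None:
        self._data = data

    def events_for(self, patient_id: str) -> List[str]:
        return self._data.get(patient_id, [])


class LegacyRecordAdapter:
    """Written last: maps a (hypothetical) legacy row shape onto the interface."""

    def __init__(self, legacy_rows: List[dict]) -> None:
        self._rows = legacy_rows

    def events_for(self, patient_id: str) -> List[str]:
        return [row["evt_cd"] for row in self._rows if row["pt"] == patient_id]
```

Because AdherenceRule only sees PatientEvents, the same tested logic runs unchanged against the sandbox double and against the adapter that wraps the real system.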
The problem had been considered relatively tough, and I had no history in the measure engine. The approach allowed a fully tested solution to be plugged into the larger system cleanly, as an implementation of a clear set of interfaces along that system's border. In doing so, I exposed the team to new patterns and solution design approaches that I hoped they would carry forward and build on as the system continued to evolve.
Example Use Case: A doctor wants a measure of the percentage of their patients who had a Hospitalization with outcome class X where either a Rehospitalization occurs 1 week later or an unexpected Office Appointment occurs within 2 weeks. The doctor expresses this using natural-language-oriented temporal sequencing expressions within the measure definition language.
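The use case above can be made concrete with a small sketch of how such a temporal condition might be evaluated per patient. This is plain Python, not MDL and not the Hadoop mapper itself. The Event shape and all function names are hypothetical, and the sketch assumes "1 week later" means "within 7 days after the hospitalization".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Event:
    kind: str
    on: date
    outcome: Optional[str] = None  # only meaningful for hospitalizations
    expected: bool = True          # only meaningful for appointments


def followed_within(anchor: Event, events: List[Event], days: int,
                    matches: Callable[[Event], bool]) -> bool:
    """True if a matching event occurs strictly after anchor, within `days` days."""
    window_end = anchor.on + timedelta(days=days)
    return any(anchor.on < e.on <= window_end and matches(e) for e in events)


def meets_measure(events: List[Event]) -> bool:
    """Hospitalization with outcome X, followed by either temporal condition."""
    for anchor in events:
        if anchor.kind != "hospitalization" or anchor.outcome != "X":
            continue
        rehospitalized = followed_within(
            anchor, events, 7, lambda e: e.kind == "hospitalization")
        unexpected_visit = followed_within(
            anchor, events, 14,
            lambda e: e.kind == "office_appointment" and not e.expected)
        if rehospitalized or unexpected_visit:
            return True
    return False


def measure_percentage(patients: Dict[str, List[Event]]) -> float:
    """Percentage of patients whose event history satisfies the measure."""
    if not patients:
        return 0.0
    hits = sum(1 for events in patients.values() if meets_measure(events))
    return 100.0 * hits / len(patients)
```

In a mapper setting, something like meets_measure would run once per patient record and emit the per-patient result, leaving aggregation such as measure_percentage to a later stage.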
Key Results
- Delivered new admin dashboard supporting 10+ healthcare organizations
- Enhanced MDL to enable complex temporal healthcare measure definitions
- Reduced user management operations time by 65% with streamlined UI