Hey, I'm Jad.
Here's a bit about me:
My full name is Ali Jad Khalil. I'm a Software Engineer currently based in Washington DC. I graduated from the University of Maryland (College Park) with Bachelor’s degrees in Computer Science and Finance.
In the decade-plus since college, I've built out complex technical solutions (primarily in the areas of Deep Learning and cybersecurity) both at early-stage companies and on multiple new teams at the world’s largest cloud provider (AWS). During that time, I've leveraged a wide array of languages (e.g. C/Java/Python), frameworks (e.g. TensorFlow/PyTorch/Spark), and platforms (e.g. AWS/Kubernetes/GitHub). Driven by a passion for innovation and a desire to deliver solutions benefitting the general public, I'm continually seeking opportunities to apply my diverse technical background to impactful AI and security projects -- whether with a forward-thinking company or an independent venture.
In my spare time, I'm an avid fan of yoga, long meals with friends/family, travelling, hiking, coding on personal projects, listening to podcasts and non-fiction audiobooks (on philosophy, health, history, biographies, psychology), and watching/playing sports (mostly tennis and basketball).
Career Path
Working on both sides of the development cycle (i.e. both building and breaking code) at various levels of the technical stack drastically enhanced my abilities as an engineer. I became adept at incorporating secure coding practices, weighing the performance ramifications of hardware/software design decisions, and making effective use of an assortment of Linux-based networking, development, and reverse-engineering tools.
In early 2017, I began following the recent breakthroughs in Computer Vision using Deep Learning (DL) and quickly developed a serious interest in the field. To gain a deeper understanding, I enrolled in the Udacity Machine Learning Nanodegree and started fervently reading scores of AI research papers. It was then that I came up with the idea of developing an AI start-up for generating annotations and basic analytics on high school sports game footage by processing it through a multi-component ML video processing framework. Soon thereafter, I left ISE to create this product as a business.
Faint Stats (Founder at AI Start-up) -- To get off the ground, I started by experimenting with DL frameworks by implementing small versions of famous models (e.g. LSTM, ResNet, Transformers) on my custom-built server at home (seen below). Once I gained enough familiarity with DL libraries and technologies, I was ready to start training larger and more complex distributed video processing networks using AWS nodes (containing high-end GPUs) on a variety of video datasets (including Kinetics, Sports 1M, UCF-101, Moments-in-Time, and my custom NBA play-by-play dataset).
Amazon Web Services (AI/Security Engineer) -- As my first task on my new team at Amazon, I was asked to design and implement the system for deploying and managing the infrastructure (such as ASGs, NLBs, EC2 fleets with multi-thousand hosts, databases, logs, alarms, etc.) needed by our disparate micro-services available in practically every AWS Region. During this period, I also routinely built out new functionality required for launch (such as the entire AWS Lambda-based solution for charging our customers millions of dollars each month). At one point, I identified a need for specific log deletion functionality within the public codebase at Amazon, so I put together and published a log deletion and compression package. (It is now leveraged by hundreds of teams at Amazon!) Once we launched AWS Network Firewall in Q4 2020, my role shifted more towards assisting with our large operational load. This included tasks such as responding to deployment incidents and newly uncovered live bugs, answering customer questions/concerns, and patching hosts to ensure software compliance. While I enjoyed this new work, I had now been on the team for three years and still had an itch to re-enter the AI space. It made sense to begin searching for another team within AWS using state-of-the-art Machine Learning technologies at scale.
Independent Security Evaluators (Security Engineer) -- After getting my Computer Science degree from UMD in mid-2012, I joined a young IT security company formed by a group of famous ethical “white-hat” hackers at Johns Hopkins. As a Security Software Engineer there, I was mainly tasked with maintaining, enhancing, and integrating a multi-functional security C library available on a multitude of operating systems. I also participated in web and mobile application vulnerability assessments as well as seminal research on router security presented at globally renowned conferences like DefCon.
During the data collection, training and evaluation of the video models, I became intimately familiar with the full ML pipeline. As the first step before training, I gathered and efficiently processed TB-sized datasets using Spark jobs. Next, I configured an EC2 image (AMI) for the high-performance cluster (HPC) nodes with a Horovod, NCCL, and OpenMPI stack so that each node would automatically participate in distributed training for the candidate video models after being launched. Once that set-up was stable, I iterated on various large network architectures in Keras by training them with multiple hyperparameters, model layers/sequencing, and loss functions. Finally, for evaluation, I visualized accuracy metrics on the test data across training sessions using TensorBoard (part of the TensorFlow project).
By 2019, after many months of testing designs for the video processing framework, my best solution was not quite accurate enough on real-world high school sports footage to be commercially viable. Furthermore, similar to many other DL projects (particularly for data- and compute-intensive domains like video processing), experimentation was becoming prohibitively expensive. So after much back and forth, I decided to summarize the key discoveries and pain-points from my experience in blog posts and tutorials (on this website) and then start looking for a job at an established company. Since I had yet to work somewhere operating at a global scale, I ultimately accepted an engineering position at AWS with a team launching a new network security service.
In mid-2022, I joined the AWS AI group to help create a service (AWS Clean Rooms ML) for recommending users in online ad campaigns using ML models trained by AWS on customer data stored in privacy-preserving AWS Clean Rooms. In the two years before we went public, I worked on our entire end-to-end tech stack including the set-up and design of API models, orchestration of back-end SageMaker (ML) and Glue (data processing) workflows, profiling and optimization of our ETL pipeline, addition of ML-specific logic (e.g. training data augmentation techniques, programmatic selection of the appropriate instance and fleet size for a job, migration to PyTorch Lightning), and curation of datasets for asserting system reliability/scalability. On top of that, I constructed a highly cost-effective and extensible framework for our integration and canary tests that caught scores of bugs by regularly covering all of our service’s synchronous and long-running functionality in nearly every possible happy and failure path. Even with an undersized team, we managed to release AWS Clean Rooms ML to the public by Q2 2024. At launch, the service was capable of training large User Behavior Models (with billions of parameters) on massive publisher datasets (containing tabular data for tens of millions of users) in 24 hours at most and also supporting thousands of concurrent Audience Generation Jobs (in each Region). Today, we are working on providing customers the ability to migrate their own User Behavior Models into the service and leverage them during audience generation.
Looking Forward
As AI has become increasingly powerful and ubiquitous over the past several years, the fields of AI and security have started overlapping in some notable ways. For instance, pre-existing security products are beginning to incorporate new advances in AI to better detect and mitigate threats. Similarly, real-world systems leveraging Deep Learning now need to protect against a growing number of vulnerabilities (e.g. information leaks, misleading inputs, hallucinations) and must always operate in alignment with human intentions. This general area of AI is often called “AI Safety”.
Having worked extensively in both AI and security on multiple teams within AWS, I'm now well versed in current technologies and fully aware of the requirements for launching and growing an ML/security service to global scale. Armed with that experience in addition to my previous time at smaller early-stage companies, I feel uniquely equipped, and even compelled, to consider opportunities (e.g. companies or independent projects) in the domains of security and Machine Learning where I can continue to grow my skills and also help build solutions positively impacting people’s lives.