I've always loved working with data. My favorite career moments have all involved analyzing complex data sets (e.g., error log files from large distributed systems), finding the underlying "story", and creating visualizations to communicate that story to wide audiences. Much of this work required developing complex software tools and simulators to speed up root-cause analysis of faults and to guide research into new technologies. But these opportunities were always the exception and not the rule.
I didn't think it was possible to have a career doing mainly this kind of work. But then I read books like Moneyball, The Numerati, and Linked, which showed me that I was wrong. Still, I wasn't sure how to pursue my interests in machine learning, statistics, programming, visualization, etc., without overspecializing in any particular field. Then came the Data Science movement.
Since 2011, I've been eagerly pursuing the field. I started by reading many books on machine learning and data analysis. I progressed to taking Stanford's Data Mining and Analysis class (STATS202) and auditing Modern Applied Statistics (STATS315A). Shortly after it was created, I enrolled in the Mining Massive Data Sets Graduate Certificate program, which I completed in 2013.
At the same time, I've gradually transitioned to roles at work focused on Big Data and Data Science. This work has given me opportunities to understand a variety of Big Data technologies in depth and to put my Data Science skills into practice on real-world problems.
In my spare time, I continue to expand my knowledge through self-education (books, Coursera, etc.) and to hone my skills through Kaggle competitions. I started this blog as a focal point and motivator for my ongoing efforts to become an accomplished Data Scientist.
This is a personal blog. All opinions expressed are mine and not those of my employer.
I use Emacs and org-mode to write material for this site. It is generated with Jekyll and statically hosted on Amazon S3. The layout is based on the Bootstrap framework. Visualizations are done with D3.js.