Google

  • TensorFlow: TensorFlow is an open-source software library for numerical computation using data flow graphs, primarily used for training deep learning models.
  • Tensor Processing Units: Tensor Processing Units (TPUs) are custom ASICs designed for accelerating machine learning research and production systems. They are used by projects such as Gmail, Photos, and Translate.
  • Apache Beam: A unified model for defining both batch and streaming data-parallel processing pipelines, as well as a set of language-specific SDKs for constructing pipelines and Runners for executing them on distributed processing backends like Apache Spark, Apache Flink, and Google Cloud Dataflow.

Coursera

  • Recommendations: Core service for all recommendation systems at Coursera, currently used on the homepage and throughout the content discovery process. Created weekly and daily email jobs for sending recommendations to learners.
  • Content Discovery: Improved content discovery by building a new onboarding experience on Coursera, which is used to personalize the search and browse experience. Also worked on ranking and indexing improvements.
  • Notifications: Service for sending email, push, and in-app notifications. Involved in features such as delivery time optimization, tracking, queuing, and A/B testing. Built an internal app for running batch campaigns, e.g. for marketing.
  • Nostos: Bulk data processing and injection service that moves data from Hadoop to Cassandra and provides a thin REST layer on top for serving offline-computed data online.
  • Workflows: Dataduct, an open-source workflow framework for creating and managing data pipelines, which leverages reusable patterns to improve developer productivity.
  • Data Collection: Designed the internal survey and crowdsourcing platform, which allowed for creating various crowdsourcing tasks or embedding surveys across the Coursera platform.
  • Developer Environment: Analytics environment based on Docker and AWS, with standardized Python and R dependencies. Wrote the core libraries that are shared by all data scientists.
  • Data Warehousing: Setup, schema design and management of Amazon Redshift. Built an internal app for access to the data using a web interface. Dataduct integration for daily ETL.
  • Course Dashboards: Instructor dashboards and learner surveying tools that helped instructors run their classes better by providing data on assignments and learner activity.

Other Projects

  • QuantSoftware Toolkit: Open-source Python library for financial data analysis and machine learning for finance.
  • Portfolio Management: Created models for portfolio hedging, portfolio optimization, and price forecasting. Also created an engine for simulating and backtesting strategies.
  • 3D Interaction Controller: Prototyped a motion capture system for controlling a 3D image in real time using hand gestures. This work was later published across two papers.
  • GitViz: Data visualization of commits in a GitHub organization, built using D3.js.
  • Mac-Setup: Open-source book that gives step-by-step instructions for setting up a developer environment on Mac OS.
  • Travelopedia: Data visualization of my travels across the globe, tracking the number of countries and states visited.
  • LaTeX Resume Template: Open-source resume template in LaTeX for software developers.