David Mah, Dropbox

Abstract: 

At Dropbox, to bridge the gap between “scripts” and “fully automatic automation”, we’ve introduced a concept of “Human Authorized Execution”. This means that a tool automatically finds problems and decides how to fix them, but a human operator is required to audit the tool’s decisions before the automation may run.
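The flow described above (the tool detects problems and decides on fixes, but nothing runs until a human signs off) can be illustrated with a minimal sketch. This is not Dropbox's implementation: the `Proposal` structure, the function names, and the disk-usage scenario are all hypothetical, chosen only to show where the approval gate sits.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Proposal:
    """A fix the tool has decided on but not yet run (hypothetical structure)."""
    problem: str
    plan: str
    action: Callable[[], str]


def detect_problems(disk_usage_pct: Dict[str, int]) -> List[Proposal]:
    """Find problems and decide how to fix each one; nothing is executed here."""
    proposals = []
    for host, pct in sorted(disk_usage_pct.items()):
        if pct >= 90:  # illustrative threshold, an assumption of this sketch
            proposals.append(Proposal(
                problem=f"{host}: disk {pct}% full",
                plan=f"rotate logs on {host}",
                # Stand-in for the real remediation; it only reports what it would do.
                action=lambda host=host: f"rotated logs on {host}",
            ))
    return proposals


def run_with_authorization(
    proposals: List[Proposal],
    approve: Callable[[Proposal], bool],
) -> List[str]:
    """Run only the proposals that the reviewer approves; record the rest as skipped."""
    results = []
    for p in proposals:
        if approve(p):
            results.append(p.action())
        else:
            results.append(f"skipped: {p.plan}")
    return results


def console_approver(p: Proposal) -> bool:
    """Ask a human operator on the terminal; anything other than 'y' means no."""
    answer = input(f"{p.problem}\nPlan: {p.plan}\nRun it? [y/N] ")
    return answer.strip().lower() == "y"
```

Calling `run_with_authorization(detect_problems({"db1": 95, "web1": 40}), console_approver)` would prompt the operator once, for `db1`. Passing the approver in as a function keeps the human decision separate from detection and execution, so the same proposals could later be handed to a different approval policy.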

Why do we need this? Because it’s terrifying to have automation run fully automatically. With a human involved, their intuition can answer a really important question: Why might I NOT want to run this script? If we took a simple approach, for instance deploying a cron job to run our scripts whenever alerts fire, we would lose that human’s sense of danger.

At Dropbox, we’ve built an alert auto-remediation platform that forces us to build our maintenance automation in a way that adheres to these principles. Through it, we’ve been able to overcome our discomfort with risky automation and transition to actually running scripts fully automatically.

In this talk we will discuss the thought process we bring towards building trustworthy automation, how we’ve driven our infrastructure organization towards a culture of embracing it, and simple steps that you could take to start gaining similar benefits in your organization.

This talk is aimed at organizations that do not currently have extensive automation but wish to put together a road map for moving towards fully automated operational infrastructure.

David Mah is an SRE at Dropbox, where he built out several of the verification and safety subsystems for Dropbox’s “Magic Pocket” storage system. More recently, he built Dropbox’s Naoru, an automation platform that is used to de-risk dangerous maintenance automation tasks.

On the flip side of career interests, David cares a lot about how to keep folks growing and happy. To that end, he runs Dropbox’s engineering internship program and is heavily involved in SRE recruiting, particularly university recruiting.

BibTeX
@conference {208528,
author = {David Mah},
title = {Bridging the Safety Gap from Scripts to Full {Auto-Remediation}},
year = {2016},
address = {Dublin},
publisher = {USENIX Association},
month = jul
}