Cloud Collaboration with
Crunch and crunch

- Instant, visual, collaborative data analysis
- Intuitive GUI for easy exploration: no programming required
- Public REST API and libraries for integration with other domains (R, Python, …), for when you want to code

Why Crunch?

- Interface design limitations of traditional software
- Technical skill required ≫ conceptual complexity of questions
  • E.g. % of males 18–24 that like popcorn at the movies? Significantly different from females?
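
To make the gap concrete, here is roughly what the popcorn question takes in base R. This is only a sketch: the `survey` data frame and its columns (`gender`, `age`, `likes_popcorn`) are simulated and invented for illustration, not real survey results.

```r
# Simulated survey data, invented purely for illustration
set.seed(42)
n <- 2000
survey <- data.frame(
    gender = sample(c("Male", "Female"), n, replace = TRUE),
    age = sample(18:65, n, replace = TRUE),
    likes_popcorn = sample(c(TRUE, FALSE), n, replace = TRUE)
)

# Restrict to respondents aged 18-24
young <- survey[survey$age >= 18 & survey$age <= 24, ]

# Cross-tabulate gender by preference, then take row percentages
tab <- table(young$gender, young$likes_popcorn)
pct <- prop.table(tab, margin = 1)[, "TRUE"] * 100
print(round(pct, 1))

# Two-sample test of equal proportions: males vs. females
test <- prop.test(x = tab[, "TRUE"], n = rowSums(tab))
print(test$p.value)
```

The question fits in one sentence, but answering it requires knowing about logical indexing, `table`, `prop.table` margins, and `prop.test`.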

Why Crunch?

Because collaboration with data is hard.

Even harder with…

- Different skill levels
- Different tools/software
- Different objectives

Leads to communication via export/import, copy/paste. Iteration is painful.
[gui application]

Challenge: Interface Design

Single source of truth is great, but
- Different skill levels
- Different concepts
- Different objectives

How do we design interfaces that work for these different audiences?

The crunch package

Idiomatic R interface to cloud service


Idiomatic R

- The `data.frame`
- Columnar, heterogeneous types
- Indexing with `$`, `[`, `[[`
- Formulas
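
As a quick reminder of the idioms listed above, here is a minimal base R sketch. The data frame `df` and its columns are made up for illustration.

```r
# A small, made-up data.frame with columns of different types
df <- data.frame(
    name = c("a", "b", "c", "d"),
    score = c(1.5, 2.5, 3.5, 4.5),
    passed = c(FALSE, TRUE, TRUE, TRUE),
    stringsAsFactors = FALSE
)

# Columnar, heterogeneous types: one class per column
col_types <- sapply(df, class)

# `$` and `[[` each extract a single column as a vector
scores <- df$score
same_scores <- df[["score"]]

# `[` subsets rows and/or columns; here it returns a data.frame
high <- df[df$score > 2, c("name", "score")]

# `$<-` adds or replaces a column
df$score_plus5 <- df$score + 5

# Formulas refer to columns by name
fit <- lm(score_plus5 ~ score, data = df)
avg_by_pass <- aggregate(score ~ passed, data = df, FUN = mean)
```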

The crunch package

- Uses same public HTTP API as web app
- (Almost) never need to know that
- Presents abstraction that datasets and variables are in local memory
- Design interface around what an R user needs to do, not what HTTP dictates
[using rcrunch]
![](assets/crunch.png)

How?

- Lots of S4 classes and methods
- `[` sometimes GETs
- `names<-` calls PATCH
- `$<-` does POST or PATCH
- … except when I had to use S3
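
The verb mapping above can be illustrated with a toy S4 class. This is a sketch under stated assumptions, not the crunch package's implementation: `ToyDataset`, `fake_request`, the `server` environment, and the example URL are all hypothetical, and no network traffic occurs. The "server" is a local environment that records which HTTP verb each R operation would have triggered.

```r
library(methods)

# Stand-in for the remote service (hypothetical): stores variables and
# logs the HTTP verb each operation would send
server <- new.env()
server$variables <- list(v1 = 1:3, v2 = c(10, 20, 30))
server$log <- character(0)

fake_request <- function (verb, detail) {
    server$log <- c(server$log, paste(verb, detail))
    invisible(NULL)
}

# The local object holds no data, only a pointer to the "remote" resource
setClass("ToyDataset", slots = c(url = "character"))

# names() reads the variable catalog: GET
setMethod("names", "ToyDataset", function (x) {
    fake_request("GET", "variables")
    names(server$variables)
})

# names<- renames variables on the server: PATCH
setMethod("names<-", "ToyDataset", function (x, value) {
    fake_request("PATCH", "variables")
    names(server$variables) <- value
    x
})

# $ fetches one variable's values: GET
setMethod("$", "ToyDataset", function (x, name) {
    fake_request("GET", name)
    server$variables[[name]]
})

# $<- creates a new variable (POST) or updates an existing one (PATCH)
setMethod("$<-", "ToyDataset", function (x, name, value) {
    verb <- if (name %in% names(server$variables)) "PATCH" else "POST"
    fake_request(verb, name)
    server$variables[[name]] <- value
    x
})

ds <- new("ToyDataset", url = "https://example.com/datasets/1/")
ds$v3 <- ds$v1 + 5        # GET v1, then POST v3
ds$v3 <- ds$v3 * 2        # GET v3, then PATCH v3
names(ds)[1] <- "first"   # GET variables, then PATCH variables
print(server$log)
```

The user writes ordinary-looking R assignments, and the methods decide which request to issue. That is the sense in which the interface follows what an R user needs rather than what HTTP dictates.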


Focus on interface by test-driving

with(test.authentication, {
    with(test.dataset(df), {
        try(ds$v3a <- ds$v3 + 5)
        test_that("A derived variable is created on the server", {
            expect_true("v3a" %in% names(allVariables(refresh(ds))))
        })
    })
})


Focus on interface by test-driving

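# NOTE: the argument name `obj.name` is a placeholder (assumed); the slide lost the original name.
setup.and.teardown <- function (setup, teardown, obj.name=".setup") {
    structure(list(setup=setup, teardown=teardown, obj.name=obj.name),
        class="SUTD")  # class name inferred from the with.SUTD method below
}

with.SUTD <- function (data, expr, ...) {
    env <- parent.frame()
    on.exit(data$teardown())  # assumed: run teardown whenever the block exits
    assign(data$obj.name, data$setup(), envir=env)
    try(eval(substitute(expr), envir=env))
}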

test.authentication <- setup.and.teardown(
   function () suppressMessages(login()),
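
The setup-and-teardown pattern sketched on this slide can be tried in isolation. The following is a minimal, self-contained version under assumptions: `fixture`, `with.fixture`, `journal`, `record`, and `test.connection` are illustrative names, not part of crunch or testthat, and the "connection" is just a list. Unlike the slide's version, the block is evaluated without `try()`, so the second call wraps `with()` in `try()` to show that teardown still runs after an error.

```r
# Fixture constructor: setup and teardown functions, plus the name that
# the setup result is bound to inside the block (all names illustrative)
fixture <- function (setup, teardown, as = ".fixture") {
    structure(list(setup = setup, teardown = teardown, as = as),
        class = "fixture")
}

# S3 method for with(): set up, evaluate the block, always tear down
with.fixture <- function (data, expr, ...) {
    env <- parent.frame()
    on.exit(data$teardown())
    assign(data$as, data$setup(), envir = env)
    eval(substitute(expr), envir = env)
}

# A journal recording the order in which things happen
journal <- new.env()
journal$events <- character(0)
record <- function (event) {
    journal$events <- c(journal$events, event)
    invisible(NULL)
}

# Toy fixture standing in for something like a login or a test dataset
test.connection <- fixture(
    setup = function () {
        record("setup")
        list(status = "open")  # the object handed to the block as `conn`
    },
    teardown = function () record("teardown"),
    as = "conn"
)

# Normal use: the block sees `conn`, and teardown runs afterwards
result <- with(test.connection, {
    record("body")
    conn$status
})

# Teardown also runs when the block fails
try(with(test.connection, stop("boom")), silent = TRUE)
print(journal$events)
```

Because `with` is an S3 generic, defining a method for the fixture's class is all it takes to get the nested `with(...)` blocks seen in the test code.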

Collaboration: Crunch style

- Make everyone more productive, less frustrated
- Everyone can work at their own level and meet their own needs
- Less time spent by data analysts on menial tasks, more time for what they’re good at
- Less latency in communication

Deliver dynamic data, not static reports


- Collaboration is hard, especially with diverse skills and domains
- Cloud can solve “single source of truth” but the interface design problem remains
- Crunch provides cloud computing + intelligent interface design for both technical and non-technical audiences

Questions? Comments?