Tutorials

The Big Picture

  • Why RCloud?
    1. Increase Insights on Big Data Analyses through Collaboration

      RCloud enables Data Scientists to find value in data by sharing ideas and techniques with each other and results with any user in the community. RCloud employs notebooks composed of cells, so topics may easily be broken up into concepts and the relationships between the concepts can be easily understood.

      In this environment, data science teams benefit from easier sharing of scripts, data feeds, experiments, annotations and automated recommendations, well beyond what traditional individual or locally based development environments provide. Since notebooks are saved and searchable, RCloud Data Scientists may reuse notebooks or parts of notebooks for similar projects rather than spending time "reinventing the wheel".

    2. Build Confidence with Reproducible Results

      Notebooks are project packages which contain all the components and dependencies of an analysis, including code, data, comments and other technical documentation, charts and deployment capabilities. This means results may be verified and reproduced by anyone with access to the notebook.

    3. Engage Users with Convenient Web-based Access

      As opposed to analyses that are stored locally, web-based content allows Data Scientists to collaborate over the Internet without concern for the location or syncing of system parameters, code and data. For example, the public instance of RCloud includes all CRAN packages.

      Local development environment parameters no longer create hurdles for viewing and running analyses, since RCloud is platform independent. User access and controls remain constant, which promotes user confidence and engagement.

    4. Access Big Data Processing Tools and JavaScript Libraries using Built-in Features

      RCloud supports all R packages, including iotools for processing large data files and SparkR, an R frontend for Apache Spark.

      An RCloud session runs on both client and server, so it is possible for R functions on the server to call JavaScript functions on the client, and vice versa. This enables rich web-based content through easy integration of custom JavaScript user interface (UI) code and JavaScript libraries such as jQuery and D3.

    5. Ensure Privacy with OCAP Implementation and Notebook Encryption

      RCloud supports efficient, secure client/server connections via the FastRWeb package and an adopted discipline known as object capabilities (ocap). Web browsers never directly instruct the RCloud backend to execute arbitrary code, which prevents unauthenticated clients from making unauthorized calls to the RCloud runtime environment; read more about ocap on our Documentation introduction page or see the Wikipedia article. RCloud also maintains an automatic Git-based trail of code modifications that documents the development history of an RCloud-based project, and RCloud notebooks may be encrypted for added security. In the notebook product space, authenticated client-server channeling and notebook encryption are unique to RCloud.

    6. Transform Knowledge into Assets with Built-in Version Control

      Code, widget, dashboard and analysis libraries are an ancillary benefit that comes with the creation of every RCloud notebook (on the public instance or in private clouds).

      These "knowledge assets" may be used on similar projects or to train new employees, for example. RCloud notebooks are searchable and forkable (sharable), so anyone with access to the notebooks may view and reuse a notebook or any part of its content.

    7. Minimize Development to Deployment Effort through Notebook Sharing

      Standard analytic workflows often involve performing a data science experiment using one set of tools and deploying the results using other tools.

      Rather than stitching pipelines together and/or dealing with inconsistencies between development and production environments, every RCloud notebook is named by a URL, so analyses may easily be transformed from code and technical documentation to Markdown-annotated notebooks or rich web-based dashboards.

    8. Access the Benefits of Using Open Source Software

      RCloud is an ideal environment for research and data analysis departments who commonly use R and/or Python to share, publish and archive their code and data. RCloud is roughly the same in terms of compute efficiency as in-memory Business Intelligence (BI) tools like Tableau and QlikView, but unlike these systems, you can add users to RCloud without licensing concerns.

      Open source software also means access to free online help and user training through forums like Stack Overflow.

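      Feature comparison across systems. The features compared are Collaboration, Built-in Security, No-cost User Scalability, Versioning / Forking, Dashboards, Multi (Programming) Language Support, Integrated Reports and Integrated Analyses. Number of these features marked with an X for each system:

      System              Features marked (of 8)
      RCloud              8
      RStudio             3
      RStudio Shiny Pro   4
      JSFiddle            4
      bl.ocks             4
      Jupyter             5
      Tableau             4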

    2016 New York R Conference Presentation; RCloud - Collaborative Environment for Visualization and Big Data Analytics




  • Public Instance versus Local Installation
  • Public Instance

    The public RCloud instance is the perfect place to test drive RCloud. Anyone with a GitHub account can have full user access to this instance. If you don't already have a GitHub account you can create one at the GitHub login page; see the next section Data Scientist Access (or Logging in) for details.

    Local Installation

    A local instance has the advantage of granting you complete control of your notebooks and data. Local instances also afford the opportunity to evaluate and experiment with the RCloud infrastructure. Windows is not supported at this time, but the following describes the requirements for the OS X download:

    Requirements:

    • OS X 10.11 (El Capitan)
    • R 3.3.1
    • XQuartz 2.7.9
    • Java 1.8.0

    Alternatively, you may use our RCloud Docker package.

    Local Enterprise Installation

    Complete, detailed instructions for installing a local enterprise RCloud instance may be found on GitHub: Setting up (Installing) RCloud.




  • Anonymous User versus Data Scientist (Logging In) Access
  • Both the public instance of RCloud and any local installation come with two methods for accessing notebooks - Anonymous (unregistered) User access and Data Scientist (registered user) access.

    Anonymous User Access

    Anonymous Users may view and interact with RCloud notebooks without having an RCloud account. This is done by navigating to the RCloud notebook web page using a URL / hyperlink. The purpose of Anonymous User access is to allow Data Scientists to easily share their work as a web page (hyperlink) with non-developers and/or users who do not have an RCloud account at any stage of the development process.

    To find out how a Data Scientist may share their work through hyperlinks, please refer to the Sharing Notebooks tutorial.

    Data Scientist Access

    An RCloud Data Scientist is someone with an RCloud account who can create, edit, fork (copy) and share notebooks. Since RCloud provides automatic version control by storing notebooks as GitHub gists, creating an RCloud account requires that you have a GitHub account. If you already have a GitHub account, you may skip to Step 3.

    1. Press the Log In button on the Try It page or type http://rcloud.social/login.R in your browser.

      RCloud Log In

    2. If you are not already logged into your GitHub account, you will be redirected to GitHub where you can log in (GitHub Sign in).

      RCloud GitHub Sign In

      Alternatively, you may register for an account if you don't already have one (GitHub Sign up).

      RCloud GitHub Sign Up

      If this is a new GitHub account, GitHub requires that you verify your email address.

      RCloud GitHub Verify Email Address

    3. After signing into your GitHub account, or registering for one as the case may be, the next step is to authorize RCloud to create gists in your GitHub account by pressing the Authorize application button.

      Note: For new GitHub accounts, you will need to navigate back to the Log In page to get to the GitHub Authorize application page, or you can find it in your GitHub profile settings.

      RCloud GitHub Authorization

    4. Once you have authorized the RCloud application, you should be automatically redirected back to the RCloud editing environment where you may begin creating and editing notebooks.

    Anonymous User versus Data Scientist (Logging In) Access Video Tutorial



Getting Started

  • Introduction to the RCloud Integrated Development Environment (IDE)
  • The RCloud Integrated Development Environment (IDE) is composed of a navigation (header) bar, a left windowshade panel, a right windowshade panel and, in the center, Prompt (R/Python) and Markdown cells. As an RCloud Data Scientist, you have the ability to create, run (execute), fork, edit and share every notebook in the IDE. This access is unique to RCloud and allows Data Scientists to leverage existing work and recreate (reproduce) past work.

    Notebooks may be created using the + (plus) symbol located at the top of the left windowshade panel and run using the play button in the navigation bar. Prompt cells are the equivalent of R or Python command line sessions. Prompt cells may be switched to Markdown cells using the dropdown menu in each cell. Cells may be edited and deleted using the icons associated with each cell.

    View HTML Share

    Entire notebooks may be deleted using the x symbol which appears when you place your mouse next to the notebook name. Likewise, notebooks may be starred ("liked"), made hidden and placed into groups using the respective icons next to the notebook name.

    RCloud Data Scientists may view past versions of notebooks by clicking on the clock icon next to the notebook name; versions are automatically created every time you run or save a notebook. Popular notebooks may be viewed by clicking the Discover menu item in the navigation bar. The RCloud UI functionality in the left and right windowshade panels includes File Upload (covered in more detail in the Data - Loading and Saving tutorial), notebook Comments, Search, Workspace, Dataframe and Session information and Help access. Assets located in the right windowshade panel may be code or images and are accessed using the RCloud API.

    Introduction to the RCloud Integrated Development Environment Video Tutorial



  • Creating, Forking (Copying) and Saving Notebooks
  • To create an RCloud notebook, you must be logged in as an RCloud Data Scientist. Once you are logged in, you may create a notebook using two methods:

    • Use the '+' button at the top of the left windowshade panel, or
    • RCloud Create Notebook Icon '+'

    • Fork (copy) an existing notebook using the fork icon in the navigation bar.
    • RCloud Fork Notebook Icon

    • Save notebooks by using the save icon in the navigation bar or by running the notebook.
    • RCloud Save Notebook using Save Icon or by Running Notebook using Run Icon

    For example, to get started, fork an existing notebook (you must be logged in to access this notebook):


    RCloud Creating Notebooks Video Tutorial



  • Markdown and RMarkdown
  • RCloud supports Markdown and RMarkdown - select the desired markdown by changing the cell type in the RCloud editing environment:

    RCloud Markdown Cell

    You may access many Markdown and RMarkdown tutorials on the web, or you can log in as an RCloud Data Scientist to view the Markdown code examples, which include text and equations:

    RCloud Markdown Sample Notebook 1

    Credit Card Loss Model

    RCloud Markdown Sample Notebook 2

    RCloud API; Access Data

    RCloud Markdown Sample Notebook 3

    STFT of an Audio File



  • Data - Loading and Saving
  • There are several methods to access data in RCloud including the following:

    1. File upload to Data Scientist home directory.

      Files may be uploaded to a Data Scientist's home directory by using the file upload GUI found in the right windowshade panel.

      RCloud File Upload Example 1

      Note: In this example, the 'Upload to notebook' box is not ticked.

      Since the data is now located in your home directory, you would access the path to the data using the following RCloud API 'rcloud.home()':

      # Use RCloud API to read data
      # Data source: http://www2.census.gov/geo/docs/maps-data/data/rel/zcta_county_rel_10.txt
      fn1 <- read.csv(rcloud.home('zcta_county_rel_10.txt'), sep=',', colClasses="character")
      summary(fn1)
      						

    2. File upload to RCloud Notebook.

      Files may be uploaded directly to a specific RCloud Notebook by using the same process as in the first method, but also ticking the 'Upload to notebook' box in the GUI.

      RCloud File Upload Example 2

      Note: In this example, the 'Upload to notebook' box is ticked.

      Since the data is now an RCloud 'asset', you would access the path to the data using the following RCloud API 'rcloud.get.asset()':

      # Use RCloud API to read data
      # Data source: http://archive.ics.uci.edu/ml/datasets/Zoo
      fn2 = rcloud.get.asset('zoo_data.txt', as.file=TRUE)      
      t2 = read.table(fn2,sep=",",header=TRUE) 
      						

    3. Drag and drop files to RCloud Assets.

      Data may also be uploaded to RCloud by dragging and dropping files from your local machine to the Asset windowpane; the 'Drop File to Asset' GUI will automatically appear as you drag files.

      RCloud Drag and Drop Data as an Asset

      Note: Using this method, file size is currently limited to 75KB on the public instance of RCloud.



      Since the data is now an RCloud asset, it is referenced using the same method as in #2:

      # Use RCloud API to read data
      # Data source: http://archive.ics.uci.edu/ml/datasets/Wholesale+customers
      fn3 = rcloud.get.asset('Wholesale_customers_data.csv', as.file=TRUE)
      t3 = read.table(fn3,sep=",",header=TRUE)
      						
    4. Manually create an RCloud Asset by typing in data.

      RCloud Data Scientists may also manually enter data as an RCloud Asset. First click on the 'New Asset' tab in the Asset panel:

      RCloud Enter Asset Example 1

      Type a file name:

      RCloud Enter Asset Example 2

      Then either type data in the tab or use keyboard shortcut keys to copy and paste (e.g., Ctrl-A, Ctrl-C, Ctrl-V).

      RCloud Enter Asset Example 3

      Since the data is now an RCloud asset, it is referenced using the same method as in #2:

      # Use RCloud API to read data
      # Data source: http://www.cs.waikato.ac.nz/ml/weka/
      fn4 = rcloud.get.asset('Play_tennis.csv', as.file=TRUE);        
      t4 = read.table(fn4,sep=",",header=TRUE)                  
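
    The first loading example passes colClasses="character" to read.csv(). A minimal sketch of why that matters for code-like columns such as ZIP or county codes is shown here. It uses a hypothetical three-line sample file written to a temporary location, so it runs in plain R; inside RCloud the path would instead come from rcloud.home() or rcloud.get.asset(..., as.file=TRUE) as described above.

```r
# Hypothetical sample file with codes that start with zeros
# (written to a temporary file so the sketch runs outside RCloud)
sample_fn <- tempfile(fileext = ".txt")
writeLines(c("ZCTA5,STATE,COUNTY",
             "00601,72,001",
             "00602,72,003"), sample_fn)

# Default parsing converts the codes to integers and drops leading zeros
as_default <- read.csv(sample_fn, sep = ",")

# colClasses = "character" keeps every column exactly as written
as_text <- read.csv(sample_fn, sep = ",", colClasses = "character")

print(as_default$ZCTA5)
print(as_text$ZCTA5)

# Clean up the temporary file
unlink(sample_fn)
```

    Reading everything as character and converting individual columns afterwards (for example with as.numeric()) is one way to keep identifiers intact while still working with numeric measures.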
      						

    There are several methods to save data in RCloud including the following:

    1. Output Data

      Save data in your RCloud home directory by specifying the path:

      oDir  = rcloud.home()
      outFn = paste(oDir,"/outputTest.txt",sep="")
      
      # Write to file
      write.table(t4, outFn)

    2. Binary Output

      To write binary output, the following code would be used instead:

      # Open the output file; "wb" = write binary
      f = file(outFn, "wb")

      # writeBin() accepts only atomic vectors, so serialize the data frame first
      writeBin(serialize(t4, NULL), f)
      close(f)

    Some useful commands for data processing are:

    1. file.remove(rcloud.upload.path("foo.txt")) removes an uploaded file.
    2. list.files(rcloud.home()) lists the files in your RCloud home directory.
    3. Since the Unix user directory may or may not coincide with the RCloud home directory depending on the deployment, rcloud.upload.path() may also be used. For example: 'list.files(rcloud.upload.path())'.
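
    The text and binary save paths can be checked with a small round trip. The following is a sketch under stated assumptions: the data frame is a hypothetical stand-in for t4, and tempdir() stands in for rcloud.home() so the code runs outside RCloud. Because writeBin() only accepts atomic vectors, the binary path serializes the data frame to a raw vector first.

```r
# Hypothetical stand-in for the t4 data frame used above
t4 <- data.frame(outlook = c("sunny", "rainy"),
                 play    = c("no", "yes"),
                 stringsAsFactors = FALSE)

# Assumption: inside RCloud this would be rcloud.home()
out_dir <- tempdir()

# Text round trip: write.table() then read.table()
txt_fn <- file.path(out_dir, "outputTest.txt")
write.table(t4, txt_fn)
t4_text <- read.table(txt_fn, header = TRUE, stringsAsFactors = FALSE)

# Binary round trip: serialize to raw bytes, write, read back, unserialize
bin_fn <- file.path(out_dir, "outputTest.bin")
con <- file(bin_fn, "wb")
writeBin(serialize(t4, NULL), con)
close(con)

con <- file(bin_fn, "rb")
raw_bytes <- readBin(con, what = "raw", n = file.info(bin_fn)$size)
close(con)
t4_binary <- unserialize(raw_bytes)

# Both copies should carry the same column values as the original
print(t4_text)
print(t4_binary)

# Clean up the temporary files
unlink(c(txt_fn, bin_fn))
```

    For whole R objects, saveRDS() and readRDS() from base R offer a shorter alternative to the explicit connection handling shown here.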

    This information may also be viewed in the RCloud Sample Notebook: Data - Loading and Saving.


    Loading Data into RCloud Video Tutorial



  • Sharing Notebooks
  • RCloud Data Scientists may obtain a hyperlink (URL) to a notebook they have created by selecting the kind of URL they would like to share with registered and unregistered users. The simplest form of sharing is the view.html option which may be selected in the drop down menu of the share icon in the navigation bar.

    HEADER_SHARE

    Clicking the share icon will produce a web page (URL) that registered users can share with other registered users. RCloud Data Scientists (registered users) may view the underlying code by clicking the edit icon or run the notebook by clicking the play icon in the navigation bar. This means that by default users who wish to view notebooks must be logged into RCloud.

    RCloud Sharing view.html

    However, if the Publish Notebook box is checked in the Advanced Menu (located in the navigation bar), any user who has network access to the notebook's URL will be able to execute (run), view and share the notebook.

    View HTML Share

    RCloud notebooks are not static web pages. Executing a notebook will fetch data and return live / updated results. Unregistered users may view the source code by selecting Show Source in the Advanced menu located in the navigation bar. Alternatively, source code may be viewed as a GitHub repository by selecting Open in GitHub in the Advanced menu.

    View HTML Share
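
    Since every notebook is addressed by a URL, a share link can also be assembled in code. The sketch below is illustrative only: notebook_url() is a hypothetical helper, the notebook ID is made up, and the "view.html?notebook=" query form is assumed by analogy with the "/edit.html?notebook=" link used in the Shiny tutorial. Inside an RCloud session, rcloud.session.notebook.id() would supply the real ID.

```r
# Hypothetical helper: join a base address, a page name and a notebook ID.
# Trailing slashes on the base address are removed before joining.
notebook_url <- function(base, page, notebook_id) {
  paste0(sub("/+$", "", base), "/", page, "?notebook=", notebook_id)
}

# Made-up ID for illustration; use rcloud.session.notebook.id() in RCloud
id <- "0123456789abcdef"

view_link <- notebook_url("http://rcloud.social/", "view.html", id)
edit_link <- notebook_url("http://rcloud.social/", "edit.html", id)

print(view_link)
print(edit_link)
```

    Whether such a link opens for unregistered users still depends on the Publish Notebook setting described above.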

    RCloud Introduction to Sharing Video Tutorial




  • Notebook Protection, Privacy and Security
  • Make a notebook protected (private) by clicking on the "eye" icon next to the notebook name in the left windowpane:

    RCloud Protect Notebook

    Protected notebooks are readable only by the owner and (optionally) a select group of users and will not show up in search results (although previously unprotected versions might).


    Use the second tab of the protection dialog to create/rename groups and/or assign other users as administrators/members of groups you administer. Alternatively, you can select Manage Groups from the Advanced menu item in the navigation bar; note that the Notebook tab will be grayed out in that case, as Manage Groups is not notebook specific.

    RCloud Group Notebook
    RCloud Manage Groups


Extending Interaction

  • RCloud shiny.html - Using Shiny on RCloud
  • This tutorial describes how to write and deploy an R Shiny application on RCloud.

    Hosting Shiny applications on RCloud allows you to enjoy the convenience of RCloud development and distribution with the elegant user interface features of Shiny.

    This tutorial assumes you are familiar with the basics of Shiny application development. To learn more about constructing Shiny apps or for a refresher on the Shiny architecture visit the RStudio Shiny Tutorial page.

    1. Create a new blank RCloud notebook by clicking on the '+' button in the left windowshade panel.

      RCloud Create Notebook

    2. Put the notebook in the Shiny share mode by clicking on the downward arrow in the 'share' icon and selecting 'shiny.html'.

      RCloud Shiny Tutorial Share Icon

    3. Enter your Shiny server and user interface code. Here is a basic example:
      library(rcloud.shiny)
      library(shiny)
      
      # Put code here that will run once when the app is loaded
      
      df = iris
      
      # This is your Shiny User Interface (UI) layout
      ui = fluidPage(
          titlePanel("My Shiny RCloud Example"),
          helpText(a("View Source in RCloud UI", target="_blank", href=paste0("/edit.html?notebook=", rcloud.session.notebook.id()))),
          verticalLayout(
            sidebarPanel(
              selectInput("src0","Sepal",c("Length","Width")),
              selectInput("src1","Petal",c("Length","Width")),
              hr(),
              helpText(paste("This data set has",nrow(df),"rows"))
            ),
            mainPanel(
              plotOutput("thePlot")
            )
      
          )
        )
      
      # This is the standard Shiny server function
      server = function(input,output)
      {
          output$thePlot = renderPlot({
      
              # src0 is the Sepal dropdown and src1 is the Petal dropdown
              x = df[[paste0("Sepal.",input$src0)]]
              y = df[[paste0("Petal.",input$src1)]]
      
              plot(x,y,xlab=paste("Sepal",input$src0),ylab=paste("Petal",input$src1))
      
          })
      }
      
      # Start the Shiny app in your browser.
      rcloud.shinyApp(ui=ui,server=server)
      
      							
    4. Notice that the Shiny server and UI functions are written exactly as they would be in a stand-alone Shiny app, except that they are not wrapped inside shinyUI() and shinyServer() functions. Instead they are simply passed to the rcloud.shinyApp() API.
    5. To run this app, first click the play button. Any R syntax errors will be shown in the notebook cell output window. If no errors occur, click the run arrow on the 'share' icon in the editor toolbar. A new tab will open in your browser with the executing Shiny app. Any runtime errors will be shown in the browser just as they would in a stand-alone Shiny app.
    6. Shiny apps may be shared publicly or privately just like any other RCloud notebook. Read the Sharing Notebooks tutorial for the details.

    7. View sample Shiny notebooks on the public instance by logging in as a Data Scientist and navigating to the 'RCloud Sample Notebooks/Dashboarding/RCloud shiny.html' directory, or click on the image below to view a Word Cloud example.
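
    The column lookup at the heart of the example can be tried in a plain R session before wiring it into Shiny. This sketch assumes, following the UI labels, that the first dropdown selects a Sepal dimension and the second a Petal dimension; axis_columns() is a hypothetical helper, not part of Shiny or RCloud.

```r
df <- iris

# Hypothetical helper: turn the two dropdown values into iris column names
axis_columns <- function(sepal_choice, petal_choice) {
  c(x = paste0("Sepal.", sepal_choice),
    y = paste0("Petal.", petal_choice))
}

# Simulate the user picking "Length" for Sepal and "Width" for Petal
cols <- axis_columns("Length", "Width")

# Fail early if either constructed name is not a column of the data set
stopifnot(all(cols %in% names(df)))

x <- df[[cols[["x"]]]]
y <- df[[cols[["y"]]]]

print(cols)
print(head(data.frame(x = x, y = y)))
```

    Checking the constructed names against names(df) is a cheap guard: a mismatched name would otherwise make df[[...]] return NULL and the plot would fail at run time.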


  • RCloud mini.html

  • RCloud notebook.R

  • Built-in Javascript Integration

  • Other Javascript Integration; User defined functions and D3.org examples