Posit x Databricks: A Game-Changing Synergy for Data Teams

On Thursday, December 5th, 2023, Posit and Databricks held a joint event, revealing several key developments in their ongoing collaboration.

These updates build on the announcements from July 2023 and show tangible progress in integrating data analysis with cloud computing. James Blair, Product Manager of Cloud Integrations at Posit, and Rafi Kurlansik, Lead Product Specialist at Databricks, presented them.

This article provides an overview of the event’s highlights and a glimpse into the future of data science with Posit and Databricks.

Highlights of the Presentation

The event, aimed at data scientists and cloud computing professionals, was divided into two main segments: Databricks’ latest capabilities and Posit’s new integrations.

Databricks’ Data Intelligence Platform

The first segment spotlighted Databricks’ investment in its Data Intelligence Platform. It introduced Databricks Connect v2, a lightweight client library for connecting remotely to Databricks, and its integration with Visual Studio Code as a plugin.

One notable use case was the use of large language models (LLMs) hosted on Databricks to interpret and answer data queries in natural language, a step forward in data accessibility and analysis.

Together, this direction for Databricks’ platform and Posit’s integration are expected to streamline data processes and make data teams more efficient.

Posit’s Integration with Databricks

The second part, which forms the crux of this article, focused on Posit’s advancements, specifically its integration with Databricks in the RStudio Pro IDE available on Posit Workbench.

Learn how Posit Connect can revolutionize remote data science team operations by exploring our detailed analysis.

New Developments in Posit Workbench

Posit introduced new features exclusive to Workbench, expected to be released by the end of the month. They are designed to improve how users interact with Databricks from within the IDE, with a focus on user experience and functionality:

  • Credential Management: Simplified authentication processes to seamlessly access Databricks resources from Posit Workbench (no more PATs [Personal Access Tokens]!).
  • New Databricks Pane: A new view into the Databricks compute console where you can view cluster information, start or stop the clusters available on the platform, and connect to them.
  • Automatic sparklyr dependency management: Workbench will manage the virtual environments required to connect to the Databricks environment.
  • Unity Catalog view: On the Connections pane, you will be able to browse the Unity Catalog through an ODBC (Open Database Connectivity) or sparklyr connection.

Enhancements in Open Source

A new version of the odbc package is set to be released on CRAN shortly, introducing a databricks() function designed to use a Databricks driver for more efficient connectivity to the Databricks environment.
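
To give a rough sense of the boilerplate that a helper like the databricks function mentioned above can hide, the sketch below assembles a Databricks ODBC connection string by hand in Python. It is an illustration only, not the odbc package's implementation. The function name build_databricks_connection_string and the default driver name are assumptions made for this example. The key names follow the settings commonly documented for the Databricks ODBC driver with token authentication, and DATABRICKS_HOST and DATABRICKS_TOKEN are the environment variables used by Databricks tooling. Check all of them against the driver documentation before relying on them.

```python
import os


def build_databricks_connection_string(http_path, host=None, token=None,
                                       driver="Simba Spark ODBC Driver"):
    """Assemble an ODBC connection string for a Databricks cluster or warehouse.

    Hypothetical helper for illustration. `driver` is an assumed driver name;
    use whatever name the driver is registered under on your machine.
    """
    # Fall back to environment variables when values are not passed explicitly.
    host = host or os.environ.get("DATABRICKS_HOST", "")
    token = token or os.environ.get("DATABRICKS_TOKEN", "")
    if not host or not token:
        raise ValueError("A workspace host and an access token are required.")

    # The Host setting takes a bare hostname: drop the scheme and trailing slash.
    if host.startswith("https://"):
        host = host[len("https://"):]
    host = host.rstrip("/")

    # Settings assumed from the Databricks ODBC driver documentation
    # (HTTPS transport with personal access token authentication).
    settings = {
        "Driver": driver,
        "Host": host,
        "Port": "443",
        "HTTPPath": http_path,
        "SSL": "1",
        "ThriftTransport": "2",
        "AuthMech": "3",
        "UID": "token",
        "PWD": token,
    }
    return ";".join(f"{key}={value}" for key, value in settings.items())
```

With both environment variables set, passing only the HTTP path of a cluster or SQL warehouse returns a single semicolon-delimited string that an ODBC client can consume. A dedicated helper spares users from remembering these settings and from embedding tokens in their scripts.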

Additionally, Posit is gearing up to provide its own ODBC driver specifically for Databricks, supplementing the existing range of drivers they offer.

These enhancements, along with significant updates to the sparklyr project to support the new Databricks Connect v2 and new Databricks runtimes, are aimed at streamlining workflows, enhancing compatibility, and boosting performance with Databricks services.

Curious about the transformative impact of open-source R/Shiny tools on pharma? Check out this article on How Open Source (R and Shiny) Is Transforming Processes in the Pharmaceutical Industry.

The Future: Posit Products in Lakehouse Applications

Posit plans to incorporate its suite of products into the Lakehouse architecture. The aim is to keep data within the secure confines of Databricks while letting data scientists access, analyze, and share it easily.

Such an integration would be a significant advance in data governance, security, and management, and it reflects Posit’s commitment to evolving with the changing dynamics of data science.

Practical Implications and Future Directions

The collaboration between Posit and Databricks opens up new avenues for data scientists and engineers, particularly those working with R.

These updates not only streamline workflows but also enhance data security, governance, and scalability. They also make building Shiny applications that use Databricks as a data source (or compute source) more appealing and within reach.

As the data science landscape continues to evolve, the Posit and Databricks partnership is poised to play a significant role in shaping future trends and technologies.
