Work with Trello Data in Apache Spark Using SQL

Access and process Trello Data in Apache Spark using the CData JDBC Driver.

Apache Spark is a fast and general engine for large-scale data processing. When paired with the CData JDBC Driver for Trello, Spark can work with live Trello data. This article describes how to connect to and query Trello data from a Spark shell.

The CData JDBC Driver offers unmatched performance for interacting with live Trello data due to optimized data processing built into the driver. When you issue complex SQL queries to Trello, the driver pushes supported SQL operations, like filters and aggregations, directly to Trello and utilizes the embedded SQL engine to process unsupported operations (often SQL functions and JOIN operations) client-side. With built-in dynamic metadata querying, you can work with and analyze Trello data using native data types.

Install the CData JDBC Driver for Trello

Download the CData JDBC Driver for Trello installer, unzip the package, and run the JAR file to install the driver.

Start a Spark Shell and Connect to Trello Data

  1. Open a terminal and start the Spark shell, passing the CData JDBC Driver for Trello JAR file with the --jars parameter. Quote the path, because it contains spaces:

    $ spark-shell --jars "/CData/CData JDBC Driver for Trello/lib/cdata.jdbc.trello.jar"
  2. With the shell running, you can connect to Trello with a JDBC URL and use the SQL Context load() function to read a table.

    Trello uses token-based authentication to grant third-party applications access to their API. When a user has granted an application access to their data, the application is given a token that can be used to make requests to Trello's API.

    Trello's API can be accessed in two ways. The first is Trello's own Authorization Route, and the second is OAuth 1.0.

    • Authorization Route: At the moment of registration, Trello assigns an API key and Token to the account. See the Help documentation for information on how to connect via the Authorization route.
    • OAuth Route: Similar to using Authorization, OAuth creates an Application Id and Secret when you create your account. See the Help documentation for information on how to connect.

    Built-in Connection String Designer

    For assistance in constructing the JDBC URL, use the connection string designer built into the Trello JDBC Driver. Either double-click the JAR file or execute the JAR file from the command line.

    java -jar cdata.jdbc.trello.jar

    Fill in the connection properties and copy the connection string to the clipboard.

    scala> val trello_df = spark.sqlContext.read.format("jdbc").option("url", "jdbc:trello:APIKey=myApiKey;Token=myGeneratedToken;").option("dbtable","Boards").option("driver","cdata.jdbc.trello.TrelloDriver").load()
  3. Once you connect and the data is loaded, the table schema is displayed.
  4. Register the Trello data as a temporary view:

    scala> trello_df.createOrReplaceTempView("boards")
  5. Perform custom SQL queries against the data using commands like the one below:

    scala> trello_df.sqlContext.sql("SELECT BoardId, Name FROM Boards WHERE Name = 'Public Board'").collect.foreach(println)

    The matching rows are printed to the console, one per line.
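
The JDBC URL used in step 2 is a "jdbc:trello:" prefix followed by semicolon-terminated Key=Value pairs. As a minimal sketch, assuming you prefer to assemble that string in code rather than paste it from the connection string designer, the helper below builds it from a list of properties. The object name TrelloJdbcUrl and its build method are hypothetical and not part of the CData driver; the property names and placeholder values are the ones shown in step 2.

```scala
// Hypothetical helper (not part of the CData driver): assembles a
// "jdbc:trello:Key=Value;..." connection URL from connection properties.
object TrelloJdbcUrl {
  def build(props: Seq[(String, String)]): String =
    props
      .map { case (key, value) => s"$key=$value;" } // each pair ends with ';'
      .mkString("jdbc:trello:", "", "")             // prepend the URL prefix
}

// Placeholder credentials from step 2; replace them with your own values.
val url = TrelloJdbcUrl.build(Seq(
  "APIKey" -> "myApiKey",
  "Token"  -> "myGeneratedToken"
))
// url is "jdbc:trello:APIKey=myApiKey;Token=myGeneratedToken;"
```

In the Spark shell, the resulting value can then be passed as .option("url", url) in place of the literal string in step 2. This sketch does not escape special characters in property values, so values containing semicolons would need extra handling.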

Using the CData JDBC Driver for Trello in Apache Spark, you can perform fast and complex analytics on Trello data, combining the power and utility of Spark with your data. Download a free, 30-day trial of any of the 160+ CData JDBC Drivers and get started today.