A Real Connection

Ikigai provides a smooth and secure way to connect to 130+ sources of live data

by Koby Frank

Data Data Everywhere. Ideally, analysts should spend their time doing two things: (1) manipulating raw data, and (2) gaining insight from it. Both steps, broadly speaking, are interesting and nontrivial. Done well, they drive good policy-making, sound engineering, healthy supply chains, and smart purchase-order decisions. There is, however, a silent step zero to this process: getting the raw data in the first place. This step is not particularly profound, and it should not take up much of anyone's time; in practice, it often takes most of it.

Data can live in many places. In fact, the modern data ecosystem is quite fractured. There are "data warehouses" like Snowflake and Amazon Redshift, where tabular files are stored in the cloud and indexed for fast retrieval and querying. There are "data lakes" from Amazon, Google, Databricks, and other vendors, where information of any sort can be tossed into storage without being filed, allowing more flexibility and fewer formatting constraints. And there is data living locally, spread across employee computers, and with vendors of all sorts: Google Sheets, Shopify, SAP, Salesforce, bank transactions, Instagram Business, and so on.

Extracting data from one of these sources and moving it to a place where you can manipulate and gain insight from it can be particularly cumbersome. In general, it is a manual process. Maybe your data is not in the correct format to be read into its destination. Maybe it is updated weekly, daily, or hourly, and each time it changes someone has to export it all over again, hardly an enjoyable exercise.

What’s a Data Connector? Hence data connectors. Through our data connectors, users have access to over 130 sources of data from within the Ikigai platform. These connectors incrementally sync with the source, providing a live linkage between the source of the data and Ikigai, where users can manipulate, visualize, and export the data with a human in the loop. Ikigai uses a collection of existing connector services (e.g., Fivetran and Plaid), as well as internally built connections for sources less commonly covered by vendors (e.g., Snowflake), to provide users with an easy interface for pulling in their data. The process usually involves a single step: entering a few fields describing the path to the specific data and, if necessary, your credentials. After that, Ikigai handles the rest. We keep the connection up-to-date, always grabbing the most recent snapshot of your data from the source, and handle all of your permissions to the data in the backend.
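Ikigai's sync machinery is internal, but the core idea of an incremental sync can be sketched in a few lines: keep a cursor (here, a last-modified timestamp) and, on each refresh, fetch only the rows past it. The `fetch_rows` callable and `updated_at` field below are illustrative assumptions, not Ikigai's actual interface.

```python
from datetime import datetime

def incremental_sync(fetch_rows, last_synced_at):
    """Pull only rows changed since the previous sync.

    fetch_rows(since=...) stands in for a source-specific API call;
    it returns dicts carrying an 'updated_at' ISO-8601 timestamp.
    Returns the new rows plus an advanced cursor for the next sync.
    """
    new_rows = fetch_rows(since=last_synced_at)
    cursor = last_synced_at
    for row in new_rows:
        ts = datetime.fromisoformat(row["updated_at"])
        if ts > cursor:
            cursor = ts
    return new_rows, cursor
```

Because only rows past the cursor move on each run, a source that updates hourly costs one small fetch per hour rather than a full re-export.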

Figure 1: Steps to connect a Google Sheets spreadsheet to Ikigai

Of course, there are more than 130 sources of data in the world. For those sources we do not yet support, the more advanced user can create what we call a custom connector: with custom Python code or our web-scraping capabilities, users can extract data from anywhere on the internet. Just like with our out-of-the-box connectors, Ikigai handles the incremental loading and syncing of this data to always keep it up-to-date.
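As a rough illustration of what such custom Python code might look like, here is a minimal connector that pulls a CSV file from any URL and returns its rows as dicts. The class and method names are hypothetical, not Ikigai's actual custom-connector API.

```python
import csv
import io
import urllib.request

class CsvUrlConnector:
    """Hypothetical custom connector: fetch a CSV from a URL and
    return its rows as a list of dicts keyed by the header row."""

    def __init__(self, url):
        self.url = url

    @staticmethod
    def parse(text):
        # DictReader uses the first CSV row as field names.
        return list(csv.DictReader(io.StringIO(text)))

    def extract(self):
        # Download the file, decode it, and hand off to parse().
        with urllib.request.urlopen(self.url) as resp:
            return self.parse(resp.read().decode("utf-8"))
```

A platform would then call `extract()` on a schedule and feed the rows into the same incremental-sync pipeline the built-in connectors use.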

Open Authorization. Many of Ikigai’s internally built-out connectors use Open Authorization, or OAuth. OAuth is a protocol for letting applications connect with one another without passing around actual user credentials. For instance, Plaid allows Ikigai to securely get transaction data from thousands of banks. Users authenticate from within Plaid or their own bank’s website, and instead of seeing those credentials, Ikigai receives tokens we show to the bank to prove we are authorized. These tokens are attached to our own client account, meaning nobody but us could use them to authorize themselves with the bank.
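The heart of the standard OAuth 2.0 flow (RFC 6749) is the authorization-code exchange: after the user signs in with the provider, the application trades a one-time code for an access token, so the user's credentials never pass through the application itself. The sketch below only builds that token request; the URL and client values are placeholders, and real providers may require additional parameters or headers.

```python
import urllib.parse

def build_token_request(token_url, client_id, client_secret,
                        auth_code, redirect_uri):
    """Build the POST target and form body for an OAuth 2.0
    authorization-code exchange. The one-time auth_code comes back
    from the provider after the user signs in; the returned body is
    what the application POSTs to trade it for an access token."""
    body = urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "code": auth_code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    })
    return token_url, body
```

Because the exchange includes the application's own `client_id` and `client_secret`, the resulting token is bound to that client, which is why a stolen token is useless to anyone else, as described above.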

For the user, it is as easy as choosing your bank and entering your credentials. After doing so, Ikigai utilizes Plaid to get a full record of transactions made in that bank account, including the amount, date, location, and category involved in each transaction, along with other relevant fields. With this data inside of Ikigai, users can then move to gaining insight from it — tracking a budget, visualizing how much is spent on each category, and so on.
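Once transaction records like these are inside the platform, the budget-tracking insight mentioned above reduces to a simple aggregation. This is a generic sketch over transaction dicts; the `category` and `amount` field names are illustrative stand-ins for whatever the connector actually delivers.

```python
from collections import defaultdict

def spend_by_category(transactions):
    """Total transaction amounts per category, e.g. for a budget
    view or a per-category spend chart."""
    totals = defaultdict(float)
    for txn in transactions:
        totals[txn["category"]] += txn["amount"]
    return dict(totals)
```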

Figure 2: Steps to connect transaction data from your bank to Ikigai

Another application we connect with using OAuth is Snowflake. Snowflake is perhaps the most popular and user-friendly data warehouse on the market. Connecting to it from outside tools, however, is not always straightforward, and often requires users to download and configure database drivers on their local machines. Ikigai connects to Snowflake without this hassle. As with Plaid, we use OAuth instead of handling any user credentials.
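To see why no local driver is needed, note that a warehouse can be queried over plain HTTPS once an OAuth token is in hand. The sketch below builds such a request in the shape of Snowflake's SQL REST API, but treat the endpoint and field names as illustrative rather than a definitive client.

```python
import json

def build_warehouse_query(account, access_token, sql):
    """Build an HTTPS request for running SQL against a warehouse
    with an OAuth bearer token, no local database driver required.
    The URL pattern and JSON shape loosely follow Snowflake's SQL
    REST API but are simplified for illustration."""
    url = f"https://{account}.snowflakecomputing.com/api/v2/statements"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"statement": sql})
    return url, headers, body
```

Any HTTP client can then POST `body` to `url` with `headers`, which is the kind of exchange a platform can run entirely in its backend on the user's behalf.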

The table below summarizes how Ikigai achieves its OAuth connections to Plaid and Snowflake. While there is a lot going on in the backend, for the user it can be a sub-sixty-second process.

Figure 3: How Ikigai achieves OAuth connection to Plaid and Snowflake

About the Author

Koby Frank (Backend Software Engineer)

Koby Frank works on building out data connector capabilities, among other features, at Ikigai. He is a recent graduate from the University of Pennsylvania, where he received a Bachelor’s degree in Mathematics and Computer Science.
