How we use Inferable at Inferable

Nadeesha Cabral

We built Inferable because we wanted to use it ourselves. Our initial use cases were centred around customers using it to automate their own infrastructure and disparate SaaS tooling. Sort of like an "intelligent Zapier that runs on-prem".

To that end, we want to demonstrate how we've been using Inferable to manage Inferable. Supporting this is our open-source data-connector, which lets us easily string together data sources.

Data Connector

A data connector is a context-aware bridge between your on-prem data sources and the Inferable Control Plane. It will:

  1. Probe the data source and build context about the domain model.
  2. Allow the user to prompt for requests in natural language.
  3. Use the context from 1. to iteratively plan and achieve the goal.

```mermaid
sequenceDiagram
    participant User
    participant Control Plane
    participant Data Connector
    participant Database

    Note over Data Connector: Initialization Phase
    Data Connector->>Database: getContext()
    Database-->>Data Connector: Schema & Domain Model
    Data Connector->>Control Plane: Send Context

    Note over Data Connector: Query Phase
    User->>Control Plane: Natural Language Query

    rect rgb(40, 40, 40)
        Note over Control Plane: Planning Phase
        Control Plane->>Control Plane: Plan Query
        Control Plane->>Control Plane: Validate Against Context
    end

    Control Plane->>Data Connector: Send Planned Query

    alt Paranoid Mode Enabled
        Data Connector->>User: Request Query Approval
        User-->>Data Connector: Approve/Reject
    end

    Data Connector->>Database: Execute SQL Query
    Database-->>Data Connector: Query Results

    alt Privacy Mode Enabled
        Data Connector->>User: Raw Results
    else Privacy Mode Disabled
        Data Connector->>Control Plane: Results for Processing
        Control Plane->>User: Processed Results
    end
```

It uses a builder pattern and provides a way to register your own "Connectors" by defining a TypeScript class. For example, the PostgresConnector (sketched after this list):

  1. Reads the table structure to build context about the domain model.
  2. Executes queries against the domain model.
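
To make this concrete, here's a minimal sketch of what a connector class could look like. The DataConnector interface and method names are illustrative assumptions for this post, not the data-connector's actual API.

```typescript
// A minimal sketch of a custom connector, assuming a DataConnector
// interface with getContext/executeQuery. These names are illustrative,
// not the data-connector's actual API.
import { Client } from "pg";

interface DataConnector {
  getContext(): Promise<string>; // describe the domain model to the Control Plane
  executeQuery(sql: string): Promise<unknown[]>; // run a planned query
}

class PostgresConnector implements DataConnector {
  constructor(private readonly client: Client) {}

  async getContext(): Promise<string> {
    // Read the table structure so the model understands the domain model
    // (a fuller introspection sketch appears in the FAQs below).
    const { rows } = await this.client.query(
      `SELECT table_name, column_name, data_type
         FROM information_schema.columns
        WHERE table_schema = 'public'`
    );
    return JSON.stringify(rows);
  }

  async executeQuery(sql: string): Promise<unknown[]> {
    const { rows } = await this.client.query(sql);
    return rows;
  }
}
```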

How we ensure security and compliance requirements

  1. Since the data connector is deployed on-prem (we use AWS ECS), neither the Control Plane nor the end-user (us) ever sees the credentials.
  2. We use a read-only connection, ensuring that the model can't issue any query that changes the data, even if it hallucinates.
  3. The connection has selective access to data, controlled with GRANT SELECT and schema isolation (a sketch follows this list).
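
As a sketch of point 3, this is roughly how such a role could be provisioned in Postgres. The role and schema names here are illustrative, not from our actual setup.

```typescript
// Hedged sketch: provisioning a read-only Postgres role for the connector.
// The role and schema names are illustrative, not from our actual setup.
import { Client } from "pg";

async function provisionReadOnlyRole(admin: Client) {
  // A login role that can only ever read.
  await admin.query(`CREATE ROLE connector_readonly WITH LOGIN PASSWORD 'change-me'`);
  // Schema isolation: grant access to a single schema only.
  await admin.query(`GRANT USAGE ON SCHEMA public TO connector_readonly`);
  await admin.query(`GRANT SELECT ON ALL TABLES IN SCHEMA public TO connector_readonly`);
  // Keep future tables readable without re-granting.
  await admin.query(
    `ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO connector_readonly`
  );
}
```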

Setting up

Our docs contain the quick start, which will help you set this up locally or in your own cloud.

It's just a single Docker container that proxies calls to and from your data systems.

How we use it

Arbitrary Data Access

Sometimes someone signs up and does something fishy, which raises an alert, but we don't always have all the telemetry in our HyperDX instance to investigate it.

At that point, we'd just use our Inferable cluster to run queries like "What's the user id of the person who owns X cluster?" or "Who ran this function?".

[Screenshot: "How many jobs have we executed today"]

Mildly complex analytics

We use (and love) PostHog. But sometimes, we want to analyse some data from a time before we introduced the analytics.

Rather than backfilling the PostHog data, we used Inferable to construct the query from the transactional database.

For example, the following query shows an in-depth understanding of the domain model.

[Screenshots: user with the most jobs, and the query generated]

Bulk Updates

We sometimes do one-off bulk updates (hello JSON columns!) in our database. But doing updates is risky, so we've turned on "paranoid mode".

Turning on "paranoid mode" in the data connector (config.paranoidMode = 1) makes it impossible for any database query to be executed without getting human approval first.
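
For reference, here's roughly how that looks in the connector config. Only paranoidMode is named in this post; the surrounding fields are illustrative assumptions about the config's shape.

```typescript
// Hedged sketch of a connector config with paranoid mode on. Only
// `paranoidMode` is named in this post; the other fields are assumptions.
const config = {
  // Credentials stay on-prem; the Control Plane never sees them.
  connectionString: process.env.DATABASE_URL,
  // 1 = every planned query needs human approval before it executes.
  paranoidMode: 1,
};
```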

[Screenshots: bulk update approval prompt, and the approved query]

Privacy Mode

Privacy mode is a feature that ensures you can keep your data out of the hands of the model providers. When results are returned after executing the query, we don't feed the function result through the model; instead, we send it straight through to the end-user in the playground.

The pro is that you have a guarantee that the data returned under privacy mode is not prone to hallucination.

The con is that this data is not available at inference time for the model to do any analysis.
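
The branch in the sequence diagram above boils down to something like this. The function names here are illustrative stand-ins, not the data-connector's actual internals.

```typescript
// Hedged sketch of the privacy-mode branch from the sequence diagram.
// Function names are illustrative stand-ins, not actual internals.
type QueryResult = Record<string, unknown>[];

async function handleResults(results: QueryResult, privacyMode: boolean) {
  if (privacyMode) {
    // Bypass the model entirely: raw rows go straight to the Playground,
    // so they can't be altered (or hallucinated) along the way.
    await sendRawResultsToUser(results);
  } else {
    // Send rows to the Control Plane so the model can analyse/summarise.
    await sendResultsToControlPlane(results);
  }
}

// Stub transports so the sketch is self-contained.
async function sendRawResultsToUser(results: QueryResult) {
  console.log("raw results -> user", results);
}
async function sendResultsToControlPlane(results: QueryResult) {
  console.log("results -> control plane for processing", results);
}
```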

This is an example of a cluster running in privacy mode.

[Screenshot: a cluster running in privacy mode]

Playground Support

Inferable Playground is an open-source UI that provides a chat/notebook-like experience for you to trigger runs and get results.

Currently, the Playground supports access to the data in raw JSON form. However, we're looking forward to introducing charts, graphs, and other means of interacting with your data.

Some FAQs

Can the model hallucinate?

Yes, anything that goes through the model is prone to hallucination. That's why we have privacy mode, which, as a side effect of skipping the model, removes the risk of hallucination.

How does the model know my schema?

The model is able to reason about the schema of the database because we've implemented a data connector for it. The data connector introspects the database and builds context about the domain model. For the Postgres connector, this is done by reading the table structure with myPostgres.getContext().

This way, the model is always aware of the domain model, and can make informed decisions.
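
For a rough idea of what that introspection might do (the actual implementation may differ), here's a sketch that reads information_schema and collapses it into compact context lines:

```typescript
// Hedged sketch of schema introspection for a Postgres connector: read
// information_schema and collapse it into compact lines the model can
// use as context. The real getContext() may work differently.
import { Client } from "pg";

async function getContext(client: Client): Promise<string> {
  const { rows } = await client.query(
    `SELECT table_name, column_name, data_type
       FROM information_schema.columns
      WHERE table_schema = 'public'
      ORDER BY table_name, ordinal_position`
  );

  // Group columns by table: "users(id integer, email text, ...)"
  const byTable = new Map<string, string[]>();
  for (const row of rows) {
    const cols = byTable.get(row.table_name) ?? [];
    cols.push(`${row.column_name} ${row.data_type}`);
    byTable.set(row.table_name, cols);
  }

  return [...byTable.entries()]
    .map(([table, cols]) => `${table}(${cols.join(", ")})`)
    .join("\n");
}
```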

How can I check the query before the model runs it?

Enabling paranoid mode (config.paranoidMode = 1) will make sure you're prompted for approval before any query is executed. This is not up to the model to decide; it's deterministic behaviour encoded in the specific data connector.

Can I add my own data connectors?

Yes! We've tried to make it as easy as possible to add your own data connectors. Check out the docs for more information.