How we use Inferable at Inferable
We built Inferable because we wanted to use it ourselves. Our initial use cases centred on customers using it to automate their own infrastructure and disparate SaaS tooling, sort of like an "intelligent Zapier that runs on-prem".
To that end, we want to demonstrate how we've been using Inferable to manage Inferable. Supporting this is our open-source data connector, which allows us to easily string together data sources.
Data Connector
A data connector is a context-aware bridge between your on-prem data sources and the Inferable Control Plane. It will:
- Probe the data source and build context about the domain model.
- Allow the user to prompt for requests in natural language.
- Use the context from the first step to iteratively plan and achieve the goal.
sequenceDiagram
    participant User
    participant Control Plane
    participant Data Connector
    participant Database
    Note over Data Connector: Initialization Phase
    Data Connector->>Database: getContext()
    Database-->>Data Connector: Schema & Domain Model
    Data Connector->>Control Plane: Send Context
    Note over Data Connector: Query Phase
    User->>Control Plane: Natural Language Query
    rect rgb(40, 40, 40)
        Note over Control Plane: Planning Phase
        Control Plane->>Control Plane: Plan Query
        Control Plane->>Control Plane: Validate Against Context
    end
    Control Plane->>Data Connector: Send Planned Query
    alt Paranoid Mode Enabled
        Data Connector->>User: Request Query Approval
        User-->>Data Connector: Approve/Reject
    end
    Data Connector->>Database: Execute SQL Query
    Database-->>Data Connector: Query Results
    alt Privacy Mode Enabled
        Data Connector->>User: Raw Results
    else Privacy Mode Disabled
        Data Connector->>Control Plane: Results for Processing
        Control Plane->>User: Processed Results
    end
It uses a builder pattern and provides a way to register your own "Connectors" by defining a TypeScript class. For example, the PostgresConnector:
- Reads the table structure to build context about the domain model.
- Executes queries against the domain model.
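To make the shape of a connector more concrete, here is a minimal sketch. Every name in it (Connector, TableContext, InMemoryConnector, ConnectorRegistry, executeQuery) is hypothetical and chosen for illustration; it is not the data connector's actual API, and it is synchronous and in-memory only to keep the example short.

```typescript
// Hypothetical shapes for illustration; not the actual data-connector API.
interface TableContext {
  table: string;
  columns: string[];
}

interface Connector {
  name: string;
  // Probe the data source and describe its domain model.
  getContext(): TableContext[];
  // Execute a query against the data source and return rows.
  executeQuery(query: string): Record<string, unknown>[];
}

// A toy connector backed by in-memory tables, standing in for a real database.
class InMemoryConnector implements Connector {
  name = "in-memory";
  private tables: Record<string, Record<string, unknown>[]>;

  constructor(tables: Record<string, Record<string, unknown>[]>) {
    this.tables = tables;
  }

  getContext(): TableContext[] {
    return Object.keys(this.tables).map((table) => {
      const first = (this.tables[table] ?? [])[0];
      return { table, columns: first ? Object.keys(first) : [] };
    });
  }

  // Only understands "SELECT * FROM <table>"; a real connector would hand SQL to the database.
  executeQuery(query: string): Record<string, unknown>[] {
    const table = /^select \* from (\w+)$/i.exec(query.trim())?.[1];
    if (!table) throw new Error(`Unsupported query: ${query}`);
    return this.tables[table] ?? [];
  }
}

// A small builder-style registry: register() returns `this` so calls can be chained.
class ConnectorRegistry {
  private connectors: Connector[] = [];

  register(connector: Connector): this {
    this.connectors.push(connector);
    return this;
  }

  // Collect the context of every registered connector, keyed by connector name.
  contexts(): Record<string, TableContext[]> {
    const result: Record<string, TableContext[]> = {};
    for (const connector of this.connectors) {
      result[connector.name] = connector.getContext();
    }
    return result;
  }
}
```

The two methods mirror the two responsibilities listed above: one builds context, the other executes queries. A real connector would be asynchronous and talk to an actual database.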
How we ensure security and compliance requirements
- Since the data connector is deployed on-prem (we use AWS ECS), neither the Control Plane nor the end user (us) ever sees the credentials.
- We use a read-only connection, ensuring that the model can't issue any query that changes the data, even if it hallucinates.
- The connection has access only to selected data, controlled with GRANT SELECT and schema isolation.
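As a rough illustration of the last two points, the sketch below generates the kind of Postgres statements that set up a read-only role scoped to one schema. The function name readOnlyGrants and the role and schema names are hypothetical placeholders, not our actual setup; the statements themselves are standard Postgres.

```typescript
// Builds the SQL an administrator could run to create a read-only, schema-scoped role.
// The role and schema names passed in are placeholders chosen by the caller.
function readOnlyGrants(role: string, schema: string): string[] {
  // Accept only simple lowercase identifiers so nothing unexpected is interpolated.
  const ident = (name: string): string => {
    if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
      throw new Error(`Unsafe identifier: ${name}`);
    }
    return name;
  };
  const r = ident(role);
  const s = ident(schema);
  return [
    `CREATE ROLE ${r} LOGIN`,
    // USAGE lets the role see objects in the schema; it grants no access to other schemas.
    `GRANT USAGE ON SCHEMA ${s} TO ${r}`,
    // SELECT only: no INSERT, UPDATE, or DELETE privileges are granted.
    `GRANT SELECT ON ALL TABLES IN SCHEMA ${s} TO ${r}`,
  ];
}
```

Because the role never receives write privileges, the database itself rejects any write the model might generate, regardless of what the connector does.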
Setting up
Our docs contain a quick start that will help you set this up locally or in your own cloud.
It's just a single Docker container that proxies calls to and from your data systems.
How we use it
Arbitrary Data Access
Sometimes someone signs up and does something fishy, which raises an alert, and we don't have all the telemetry in our HyperDX instance to answer it.
At that point, we use our Inferable cluster to run queries like "What's the user ID of the person who owns X cluster?" or "Who ran this function?".
Mildly complex analytics
We use (and love) PostHog. But sometimes, we want to analyse some data from a time before we introduced the analytics.
Rather than backfilling the PostHog data, we used Inferable to construct the query from the transactional database.
For example, the following query shows an in-depth understanding of the domain model.
Bulk Updates
We sometimes do one-off bulk updates (hello, JSON columns!) in our database. But updates are risky, so we've turned on "paranoid mode".
Turning on paranoid mode in the data connector (config.paranoidMode = 1) makes it impossible for the model to issue a database query without getting human approval first.
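The gate can be pictured as a small wrapper around query execution. This is a minimal sketch of the idea, not the connector's implementation: only the paranoidMode flag comes from the text above, while runWithApproval, Approver, and Executor are hypothetical names.

```typescript
// Hypothetical types; only the `paranoidMode` flag is taken from the connector config.
type Config = { paranoidMode: 0 | 1 };
type Approver = (sql: string) => Promise<boolean>;
type Executor = (sql: string) => Promise<unknown[]>;

// Runs a query, but asks a human first when paranoid mode is on.
async function runWithApproval(
  config: Config,
  sql: string,
  approve: Approver,
  execute: Executor,
): Promise<unknown[]> {
  if (config.paranoidMode === 1) {
    const approved = await approve(sql);
    // The check lives in ordinary code, so the model cannot talk its way past it.
    if (!approved) {
      throw new Error("Query rejected by the user");
    }
  }
  return execute(sql);
}
```

The important property is that a rejected query never reaches the executor at all.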
Privacy Mode
Privacy mode is a feature that ensures you can keep your data out of the hands of the model providers. When the results are returned after executing the query, we don't feed the function result through the model, and instead send it straight through to the end user in the playground.
The pro is that you have a guarantee that the data returned under privacy mode is not prone to hallucination.
The con is that this data is not available at inference time for the model to do any analysis.
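The trade-off comes down to where the query results are routed. The sketch below illustrates that decision under our own assumptions; routeResults, Rows, and RoutedResults are hypothetical names, not part of the data connector.

```typescript
type Rows = Record<string, unknown>[];

// What each party receives after a query has been executed.
interface RoutedResults {
  // Rows handed to the model for further processing, or null if the model sees nothing.
  forModel: Rows | null;
  // Raw rows sent straight to the end user, or null if the user gets the model's processed answer instead.
  forUser: Rows | null;
}

function routeResults(privacyMode: boolean, rows: Rows): RoutedResults {
  if (privacyMode) {
    // The model never sees the data, so it can neither analyse nor misreport it.
    return { forModel: null, forUser: rows };
  }
  // Without privacy mode, the model processes the rows before the user sees an answer.
  return { forModel: rows, forUser: null };
}
```

In privacy mode the user receives exactly what the database returned, which is where the hallucination guarantee comes from.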
This is an example of a cluster running in privacy mode.
Playground Support
Inferable Playground is an open-source UI that provides a chat- or notebook-like experience for you to trigger runs and get results.
Currently, the Playground supports access to the data in raw JSON form. However, we're looking forward to introducing charts, graphs, and other means of interacting with your data.
Some FAQs
Can the model hallucinate?
Yes, anything that goes through the model is prone to hallucination. That is why we have privacy mode, which, as a side effect of skipping the model, removes the risk of hallucination in the returned data.
How does the model know my schema?
The model is able to reason about the schema of the database because we've implemented a data connector for it. The data connector introspects the database and builds a schema of the domain model. For the Postgres connector, this is done by reading the table structure with myPostgres.getContext().
This way, the model is always aware of the domain model, and can make informed decisions.
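As a rough picture of what such a context could look like, the sketch below turns column metadata into a compact text description. It assumes rows shaped like three columns of Postgres's information_schema.columns view; buildSchemaContext and ColumnRow are hypothetical names, and this is not necessarily how the Postgres connector builds its context.

```typescript
// Row shape matching three columns of Postgres's information_schema.columns view.
interface ColumnRow {
  table_name: string;
  column_name: string;
  data_type: string;
}

// Groups columns by table and renders one line per table, e.g. "users(id uuid, email text)".
function buildSchemaContext(rows: ColumnRow[]): string {
  const tables = new Map<string, string[]>();
  for (const row of rows) {
    const columns = tables.get(row.table_name) ?? [];
    columns.push(`${row.column_name} ${row.data_type}`);
    tables.set(row.table_name, columns);
  }
  // Map preserves insertion order, so tables appear in the order they were first seen.
  return Array.from(tables, ([table, columns]) => `${table}(${columns.join(", ")})`).join("\n");
}
```

A description like this is small enough to include with every request, which is one way to keep the model aware of the domain model.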
How can I check the query before the model runs it?
Enabling paranoid mode (config.paranoidMode = 1) will make sure you're prompted for approval before any query is executed. This is not up to the model to decide; it's deterministic behaviour encoded in the specific data connector.
Can I add my own data connectors?
Yes! We've tried to make it as easy as possible to add your own data connectors. Check out the docs for more information.