Python made data accessible. TL makes it fast, safe, and intelligent, in one compiled language. 1,315 tests pass across 34 implementation phases; AI agents with tool-use, generics, pattern matching, a bidirectional Python FFI, LLVM and WASM backends, a package manager, and a full LSP are already shipping.
```
source users = postgres("db").table("users") -> User

transform active_users(src: table<User>) -> table<User> {
  src
    |> filter(is_active == true)
    |> clean(nulls: { name: "unknown" })
    |> with { tenure = today() - signup_date }
}

model churn = train xgboost {
  data: active_users(users)
  target: "is_active"
  features: [tenure, monthly_spend]
}
```
Tables, streams, and tensors are native types in the language.
ETL/ELT flows are composable first-class constructs.
train, predict, embed, and agent are keywords — not libraries.
No GIL, automatic partitioning across cores.
Built-in error handling for unreliable data sources.
Python-like readability, Rust-like safety guarantees.
Compiled to native code with lazy evaluation. Performance is the default, not an afterthought.
Stop duct-taping together a dozen tools. TL unifies the modern data stack into one language.
```
schema User {
  id: int64
  name: string
  email: string
  signup_date: date
  is_active: bool
}

source users = postgres("db")
  .table("users") -> User
```
```
transform clean_users(src: table<User>) {
  src
    |> filter(is_active == true)
    |> clean(nulls: { name: "unknown" })
    |> with { tenure = today() - signup_date }
}

pipeline daily_etl {
  users |> clean_users
}
```
```
model churn_predictor = train xgboost {
  data: clean_users(users)
  target: "is_active"
  features: [tenure, monthly_spend]
  split: 0.8
}

// Use the model
let result = predict(churn_predictor, new_user)
```
```
match load_users("data.csv") {
  Ok(users) => process(users)
  Err(DataError::FileNotFound(p)) => log("Missing: {p}")
  Err(e) => alert("{e}")
}

// Destructuring + guards
let [head, ...tail] = items
let Point { x, y } = origin
```
```
fn top_n<T: Comparable>(
  data: table<T>,
  col: fn(T) -> float64,
  n: int
) -> table<T> {
  data |> sort(col, desc) |> limit(n)
}

trait Connectable {
  fn connect() -> result<Conn, Error>
}
```
No frameworks. No glue code. Define autonomous AI agents with tool-use, multi-provider LLM support, and lifecycle hooks — all with a single keyword.
```
// Define tool functions in pure TL
fn search(query) {
  let resp = http_request("GET", "https://api.search.com/v1?q=" + query, none, none)
  json_parse(resp.body)
}

// Declare the agent
agent research_bot {
  model: "gpt-4o",
  system: "You are a research assistant.",
  tools {
    search: {
      description: "Search the web",
      parameters: {
        type: "object",
        properties: { query: { type: "string" } }
      }
    }
  },
  max_turns: 5,
  on_tool_call { println("[LOG] " + tool_name) }
}

// Run it
let result = run_agent(research_bot, "What is quantum computing?")
println(result.response)
```
agent is a language keyword, not a library import. Tools are TL functions wired directly to the LLM.
OpenAI, Anthropic, Ollama, or any OpenAI-compatible endpoint. Auto-detects protocol from model name. One base_url field to switch.
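As a sketch of the provider switch described above (the `base_url` field comes from the source; the model name and endpoint shown are illustrative):

```
// Same agent body; only base_url changes per provider
agent local_bot {
  model: "llama3",                        // Ollama-style model name
  base_url: "http://localhost:11434/v1",  // any OpenAI-compatible endpoint
  system: "You are a helpful assistant."
}
```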
The runtime handles multi-turn tool calling, JSON arg conversion, and result formatting. You just write the function.
on_tool_call and on_complete blocks for logging, metrics, or custom logic at each step.
Agents use the same table, stream, and connectors as your data pipelines — no serialization layer needed.
Columnar, lazy-evaluated, and partitionable. The core data type for batch processing.
Infinite, windowed, real-time. For continuous data processing and event streams.
N-dimensional arrays for AI and machine learning. Shape-checked at compile time.
A trained AI model as a first-class value. Serialize, version, deploy natively.
Autonomous AI agent with tool-use. Multi-provider LLM, lifecycle hooks.
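The native types above might compose as follows. This is a hypothetical sketch: the `tensor<...>` shape annotation and `window` call are illustrative syntax, not confirmed by the source.

```
// table: columnar, lazy batch data (from the examples above)
let active = users |> filter(is_active == true)

// stream: windowed, continuous data (window syntax illustrative)
stream clicks = kafka("broker").topic("clicks") -> Click
let per_min = clicks |> window(1m) |> count()

// tensor: shape-checked at compile time (shape annotation illustrative)
let weights: tensor<float64, [128, 10]> = zeros(128, 10)
```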
Rust-inspired ownership without lifetime annotations. The compiler guarantees memory safety and data-race freedom at compile time.
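Assuming move-by-default semantics in the Rust style (the exact rules are not spelled out in the source), the guarantee could look like this:

```
let a = load("raw_users.csv")   // a owns the table
let b = a                       // ownership moves to b; no copy
// process(a)                   // would be a compile error: a was moved
process(b)                      // fine — exactly one owner at a time
```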
Built entirely in Rust. TL-IR doubles as a query plan — enabling data-aware optimizations like predicate pushdown, column pruning, and join reordering.
TL's compiler sees the entire pipeline as one program — eliminating serialization boundaries between tools.
Performance targets are based on architecture analysis; benchmarks will be published with reproducible scripts.
Rust-inspired result<T, E> with data-specific error types and declarative cleaning — not try/catch bolted on as an afterthought.
```
fn load_users(path: string) -> result<table<User>, DataError> {
  let raw = read_csv(path)?
  let valid = raw |> validate_schema(User)?
  Ok(valid)
}

match load_users("data.csv") {
  Ok(users) => process(users)
  Err(DataError::SchemaViolation(d)) => alert("Drift: {d}")
  Err(e) => log("{e}")
}
```
```
let users = load("raw_users.csv")
  |> clean {
    nulls: {
      name: fill("UNKNOWN")
      email: drop_row
      age: fill(median)
    }
    duplicates: dedupe(by: email)
    outliers: { age: clamp(0, 150) }
  }
  |> validate {
    assert null_rate(email) == 0.0
    assert unique(id)
  }
```
First-class connectors for databases, object storage, message queues, and APIs. All type-safe and schema-aware.
BigQuery, Snowflake, and more connectors are planned for the production release.
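Modeled on the `postgres("db").table("users") -> User` example shown earlier, other connectors would plausibly follow the same shape (the `kafka` and `s3` constructors below are illustrative, not confirmed by the source):

```
// Type-safe, schema-aware sources (constructor names illustrative)
source events = kafka("broker:9092").topic("events") -> Event
source logs   = s3("my-bucket").prefix("raw/") -> LogLine
```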
Bidirectional Python FFI via pyo3. Import Python modules, call functions, convert tensors to NumPy — all from TL code.
```
// Import any Python library
let np = py_import("numpy")
let pd = py_import("pandas")
let sklearn = py_import("sklearn.metrics")

// Call Python functions with TL values
let score = py_call(sklearn.accuracy_score, y_true, y_pred)

// TL Tensor <-> NumPy ndarray
let pi = np.pi
let arr = np.sqrt(16)
```
int, float, string, bool, list, map, set — all auto-converted between TL and Python.
TL tensors convert seamlessly to/from NumPy ndarrays for ML workflows.
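A sketch of the round trip described above. The `to_numpy` and `from_numpy` helpers are hypothetical names for the conversion the source mentions; only `py_import` appears in the examples above.

```
let np = py_import("numpy")
let t  = tensor([1.0, 4.0, 9.0])
let nd = to_numpy(t)              // hypothetical: TL tensor -> ndarray
let r  = from_numpy(np.sqrt(nd))  // hypothetical: ndarray -> TL tensor
```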
Use natural math.sqrt(16) syntax on Python objects via method dispatch.
Python FFI is opt-in via feature flag. Zero overhead when not used.
Syntax highlighting, diagnostics, go-to-definition, hover docs, and document symbols.
tl add, tl update, tl outdated — full dependency management with lockfile and transitive resolution.
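A typical session with the commands named above might look like this (the package name is hypothetical; the comments describe the behavior stated in the source):

```
$ tl add http_utils   # hypothetical package; resolved transitively, pinned in lockfile
$ tl outdated         # list dependencies with newer versions available
$ tl update           # update dependencies and refresh the lockfile
```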
tl fmt, tl lint, tl check — AST-guided formatting, naming conventions, and compile-time type safety.
tl doc generates HTML, Markdown, or JSON docs from /// doc comments with cross-references.
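A sketch of the `///` doc-comment syntax the source names, reusing the `top_n` signature from the generics example above:

```
/// Returns the `n` largest rows of `data`, ranked by `col`.
fn top_n<T: Comparable>(data: table<T>, col: fn(T) -> float64, n: int) -> table<T> { ... }
```

Running `tl doc` over this would emit HTML, Markdown, or JSON documentation with cross-references.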
tl inspect, tl profile, tl lineage — preview data, statistical profiles, and lineage graphs.
The only language where data pipelines, SQL-like queries, ML training, AI agents, and real-time streaming are all first-class features — not libraries.
ThinkingLanguage is licensed under Apache 2.0.