TII (Falcon)

Falcon 40B

Self-host only · Small memory

Falcon 40B is TII (Falcon)'s small-memory model. This page shows current pricing, an interactive cost calculator, and a side-by-side comparison with similar models.

Input

n/a

Output

n/a

Cached

n/a

Batch

n/a

Interactive

Calculate your Falcon 40B bill.

Adjust the workload below and watch the monthly cost update in real time.


Technical specifications

Falcon 40B at a glance.

Memory

2,048

tokens

Max reply

n/a

tokens

Memory tier

Small

a few emails or a short document

Tokenizer

default

Released

n/a

Training cutoff

n/a

Availability

Self-host only

Status

active

What it can do

Capabilities & limits.

  • Understands images
  • Deep step-by-step thinking
  • Uses tools / calls functions
  • Strict JSON output
  • Streams replies
  • Fine-tunable on your data

When to pick Falcon 40B

  • High-volume workloads where unit cost matters.

When to look elsewhere

  • You need a managed endpoint; this one is self-host only.
  • Your workload involves images; pick a vision-capable model instead.
  • You need tool-use / function calling for agent workflows.
  • Your inputs routinely exceed short documents.

FAQ

Falcon 40B: the questions we see most.

Pricing, capabilities, and alternatives, generated from the same data that powers the calculator above.


Falcon 40B is an open-weight model with no managed endpoint: you run it on your own GPUs, so the cost depends on your hardware choice rather than a per-token price. Where third-party hosts such as Together AI, Fireworks, or Replicate serve open-weight models, prices run from around $0.20–$2.00 per 1M tokens depending on model size.
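Because self-hosted cost comes from hardware rather than a per-token price, it can help to convert a GPU rental rate into an effective price per 1M tokens. The following is a minimal sketch, not a benchmark: the function name, the hourly rate, the GPU count, and the throughput figure are all hypothetical placeholders, and it assumes the GPUs are fully utilized.

```python
def cost_per_million_tokens(gpu_hourly_usd: float,
                            num_gpus: int,
                            tokens_per_second: float) -> float:
    """Estimate hardware cost (USD) per 1M tokens for a self-hosted model.

    Assumes the GPUs are busy 100% of the time; idle time raises the
    effective cost proportionally.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    hourly_cost = gpu_hourly_usd * num_gpus      # USD per hour for the node
    tokens_per_hour = tokens_per_second * 3600   # tokens produced per hour
    return hourly_cost / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Placeholder inputs, NOT measured values for Falcon 40B:
    # 2 GPUs at $1.80/hour each, 1,000 tokens/second aggregate throughput.
    price = cost_per_million_tokens(gpu_hourly_usd=1.80,
                                    num_gpus=2,
                                    tokens_per_second=1000.0)
    print(f"Estimated cost: ${price:.2f} per 1M tokens")
```

Substituting your own rental rate and measured throughput gives a figure you can set against hosted per-token prices.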
Falcon 40B has a 2,048-token context window (small memory: a few emails or a short document). At a rough 0.75 English words per token, that is about 1,500 words shared between input, history, and the reply in a single call.
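With a window this small, the input, the conversation history, and the reply all compete for the same 2,048 tokens. The sketch below shows one way to budget for that: reserve room for the reply, then keep only the most recent messages that fit. The helper names and the 512-token reply reservation are illustrative assumptions, and the per-message token counts are assumed to come from the model's own tokenizer.

```python
CONTEXT_WINDOW = 2048  # tokens, as listed in the specifications on this page


def input_budget(reserved_reply_tokens: int,
                 window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for prompt and history after reserving room for the reply."""
    if not 0 <= reserved_reply_tokens <= window:
        raise ValueError("reserved_reply_tokens must be between 0 and window")
    return window - reserved_reply_tokens


def trim_history(message_token_counts: list[int], budget: int) -> list[int]:
    """Keep the newest messages whose combined token count fits the budget.

    Input is ordered oldest to newest; the result preserves that order.
    """
    kept: list[int] = []
    total = 0
    for count in reversed(message_token_counts):  # walk newest first
        if total + count > budget:
            break  # this message and everything older is dropped
        kept.append(count)
        total += count
    kept.reverse()
    return kept


if __name__ == "__main__":
    budget = input_budget(reserved_reply_tokens=512)  # hypothetical reservation
    history = [900, 700, 400, 300]                    # hypothetical token counts
    print(budget, trim_history(history, budget))
```

Dropping the oldest messages first is only one policy; summarizing older turns is a common alternative when early context matters.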
Models in a similar class include Falcon 7B, Falcon 180B, and OLMo 3. The "Similar models" section below this FAQ links to each.
