LLMs

Features (natively supported)

All LLMs implement the Runnable interface, which comes with default implementations of all methods, i.e. invoke, batch, stream, and map. This gives all LLMs basic support for invoking, streaming, batching, and mapping requests, implemented by default as follows:

  • Streaming support defaults to returning an AsyncIterator of a single value: the final result returned by the underlying LLM provider. This doesn't give you token-by-token streaming, which requires native support from the LLM provider, but it ensures that code expecting an iterator of tokens works with any of our LLM integrations.
  • Batch support defaults to calling the underlying LLM in parallel for each input. The concurrency can be controlled with the maxConcurrency key in RunnableConfig.
  • Map support defaults to calling .invoke on each element of the array it was called with.
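To make these defaults concrete, here is a minimal sketch of how a fallback stream and batch can be built on top of a native invoke. This is illustrative only, not LangChain's actual internals: FakeLLM, its echo behavior, and the worker-pool batching are assumptions for the example, though the maxConcurrency option mirrors the RunnableConfig key described above.

```typescript
// Hypothetical config type mirroring the maxConcurrency key in RunnableConfig.
type RunnableConfig = { maxConcurrency?: number };

// FakeLLM is an illustrative stand-in for an LLM integration whose only
// native capability is invoke.
class FakeLLM {
  // Native call: the provider returns the whole completion at once.
  async invoke(prompt: string): Promise<string> {
    return `echo:${prompt}`;
  }

  // Default streaming: an AsyncIterator that yields a single value
  // (the final result) rather than token-by-token chunks.
  async *stream(prompt: string): AsyncGenerator<string> {
    yield await this.invoke(prompt);
  }

  // Default batching: invoke each input in parallel, with the number of
  // in-flight calls capped by maxConcurrency.
  async batch(prompts: string[], config: RunnableConfig = {}): Promise<string[]> {
    const limit = config.maxConcurrency ?? prompts.length;
    const results: string[] = new Array(prompts.length);
    let next = 0;
    // Each worker pulls the next unclaimed index until the inputs run out.
    const worker = async () => {
      while (next < prompts.length) {
        const i = next++;
        results[i] = await this.invoke(prompts[i]);
      }
    };
    const workers = Array.from(
      { length: Math.min(limit, prompts.length) },
      worker
    );
    await Promise.all(workers);
    return results;
  }
}
```

With this sketch, iterating `for await (const chunk of llm.stream("hi"))` produces exactly one chunk, and `llm.batch(["a", "b", "c"], { maxConcurrency: 2 })` resolves to results in input order while never running more than two calls at once.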

Each LLM integration can optionally provide native implementations of invoke, stream, or batch, which can be more efficient for providers that support them. The table below shows, for each integration, which features have native support.

| Model | Invoke | Stream | Batch |
| --- | --- | --- | --- |
| AI21 |  |  |  |
| AlephAlpha |  |  |  |
| AzureOpenAI |  |  |  |
| CloudflareWorkersAI |  |  |  |
| Cohere |  |  |  |
| Fireworks |  |  |  |
| GooglePaLM |  |  |  |
| HuggingFaceInference |  |  |  |
| LlamaCpp |  |  |  |
| Ollama |  |  |  |
| OpenAI |  |  |  |
| OpenAIChat |  |  |  |
| Portkey |  |  |  |
| Replicate |  |  |  |
| SageMakerEndpoint |  |  |  |
| Writer |  |  |  |
| YandexGPT |  |  |  |

All LLMs

| Label | Description |
| --- | --- |
| AI21 | You can get started with AI21Labs' Jurassic family of models, as well... |
| AlephAlpha | LangChain.js supports AlephAlpha's Luminous family of models. You'll ... |
| AWS SageMakerEndpoint | LangChain.js supports integration with AWS SageMaker-hosted endpoints... |
| Azure OpenAI | Azure... |
| Bedrock | Amazon Bedrock is a fully managed... |
| ChromeAI | This feature is experimental and is subject to change. |
| Cloudflare Workers AI | This will help you get started with Cloudflare Workers AI text... |
| Cohere | This will help you get started with Cohere completion models (LLMs)... |
| Deep Infra | LangChain supports LLMs hosted by Deep Infra through the DeepInfra wr... |
| Fireworks | Fireworks AI is an AI inference platform to run... |
| Friendli | Friendli enhances AI application performance and optimizes cost savin... |
| (Legacy) Google PaLM/VertexAI | The Google PaLM API is deprecated and will be removed in 0.3.0. Pleas... |
| Google Vertex AI | Google Vertex is a service that... |
| Gradient AI | LangChain.js supports integration with Gradient AI. Check out Gradien... |
| HuggingFaceInference | Here's an example of calling a HuggingFaceInference model as an LLM: |
| Layerup Security | The Layerup Security integration allows you to secure your calls to a... |
| Llama CPP | Only available on Node.js. |
| MistralAI | Mistral AI is a platform that offers hosting for... |
| NIBittensor | This module has been deprecated and is no longer supported. The docum... |
| Ollama | This will help you get started with Ollama text completion models... |
| OpenAI | OpenAI is an artificial... |
| PromptLayer OpenAI | This module has been deprecated and is no longer supported. The docum... |
| RaycastAI | Note: This is a community-built integration and is not officially sup... |
| Replicate | Here's an example of calling a Replicate model as an LLM: |
| Together AI | You are currently on a page documenting the use of Together AI models... |
| WatsonX AI | LangChain.js supports integration with IBM WatsonX AI. Check out Watso... |
| Writer | LangChain.js supports calling Writer LLMs. |
| YandexGPT | LangChain.js supports calling YandexGPT LLMs. |
