Simon Couch - Posit, PBC
Open Source Group, R / LLMs
Today, we’re talking about the LHS⬅️
Meet ellmer!🐘
These are the same:
Your turn! Create a chat object and say “hey!”
chat_github() might "just work"
chat_anthropic() can be set up using the instructions linked below
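For reference, one way to complete the exercise (assuming you have credentials set up for your chosen provider; the provider here is just an example):

```r
library(ellmer)

# create a chat object with a provider you have access to...
chat <- chat_github()

# ...and send it a message
chat$chat("hey!")
```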
Your turn: adjust the system prompt passed to the model so that, when asked a question like "What's 2+2?", it returns only the answer as a word (rather than a digit), with no punctuation or exposition.
You can get a lot of mileage out of the system prompt:
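A minimal sketch of what that can look like (the prompt text and provider here are illustrative, not from the slides):

```r
library(ellmer)

# the system prompt shapes every response in the conversation
chat <- chat_anthropic(
  system_prompt = "Answer with a single word, no punctuation or explanation."
)

chat$chat("What's 2+2?")
# should reply with something like "four"
```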
What if that wasn’t super difficult to interface with?
OpenAI and Anthropic have two main ways they make money from their models:
API usage is "pay-as-you-go" by token. A token is a short chunk of text; for example, "unconventional" might be split into three tokens:
un|con|ventional
Here’s the pricing per million tokens for some common models:
| Name | Input | Output |
|---|---|---|
| GPT 4o | $3.75 | $15.00 |
| GPT 4o-mini | $0.15 | $0.60 |
| Claude 4 Sonnet | $3.00 | $15.00 |
To put that into context, the source code for these slides so far is 650 tokens.
If I input them to GPT 4o:
\[ 650 \text{ tokens} \times \frac{\$3.75 }{1,000,000~\text{tokens}} = \$0.00244 \]
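The same arithmetic in R, using the input price from the table above:

```r
tokens <- 650
input_price_per_million <- 3.75  # GPT 4o input, $ per million tokens

tokens * input_price_per_million / 1e6
#> [1] 0.0024375
```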
library(ggplot2)
library(modeldata)

# annual Stack Overflow Developer Survey results
stackoverflow
#> # A tibble: 5,594 × 21
#> Country Salary YearsCodedJob OpenSource Hobby CompanySizeNumber Remote
#> <fct> <dbl> <int> <dbl> <dbl> <dbl> <fct>
#> 1 United Kingdom 1 e5 20 0 1 5000 Remote
#> 2 United States 1.3 e5 20 1 1 1000 Remote
#> 3 United States 1.75e5 16 0 1 10000 Not r…
#> # ℹ 5,590 more rows
#> # ℹ 14 more variables: CareerSatisfaction <int>, Data_scientist <dbl>, …
If I type "plot salary vs experience", what information does the model need access to in order to complete that request?
The first two can be inferred from the source code, but the third requires access to your R session.
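For comparison, the kind of code we'd hope the model writes back, using the column names visible in the printout above:

```r
# a plausible "plot salary vs experience" answer, given the columns above
ggplot(stackoverflow, aes(x = YearsCodedJob, y = Salary)) +
  geom_point()
```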
A "provider" is a service that hosts models behind an API.
In ellmer, each provider has its own chat_*() function, like chat_github() or chat_anthropic():
chat_github() serves some popular models, like GPT-4o, for "free"
chat_openai() serves OpenAI's models, like GPT-4o
chat_anthropic() serves Claude Sonnet
You can be your own "provider", too: chat_ollama() uses a model that runs on your laptop
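All of these functions return chat objects with the same interface, so switching providers is usually a one-line change (the local model name below is an example, not from the slides):

```r
library(ellmer)

chat <- chat_github()                    # GitHub Models: GPT-4o and friends, "free"
chat <- chat_anthropic()                 # Anthropic: Claude Sonnet
chat <- chat_ollama(model = "llama3.2")  # a local model served by Ollama
```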
Many organizations have private deployments of models set up for internal, secure use. ellmer supports the common ones.
Ask around to see if this is the case at NOAA/NASA!
github.com/simonpcouch/openscapes-25