david
|
4d88c30242
conflicts resolved
|
2 months ago |
Nicholas Carlini
|
a35d306f34
Add O1 option
|
3 months ago |
Nicholas Carlini
|
6bb358f269
Add GPT-4o mini result
|
4 months ago |
Nicholas Carlini
|
b09d7fede5
Merge pull request #17 from simveit/main
|
5 months ago |
Nicholas Carlini
|
d0ecd8c33b
Update with Sonnet 3.5 and Gemini 1.5 Pro results
|
5 months ago |
Simon Veitner
|
eae650718a
Added groq to example config
|
5 months ago |
Simon Veitner
|
7b972e43ee
Enable support for Groq models
|
5 months ago |
Your Name
|
dfea2287a9
I don't have blind faith in LLMs
|
5 months ago |
Nicholas Carlini
|
16f8d13900
Add a few more recent tests
|
5 months ago |
Nicholas Carlini
|
3ad3cfde3c
Update README
|
7 months ago |
Nicholas Carlini
|
be07d215b2
Improve evaluations for several tests
|
7 months ago |
Nicholas Carlini
|
2e14f5e5d0
Fix python version errors
|
7 months ago |
Nicholas Carlini
|
53d1251d88
A dozen new tests from the last month
|
7 months ago |
Nicholas Carlini
|
0d2b4d9e9d
Edit README to say how to generate result figures
|
8 months ago |
Nicholas Carlini
|
656a597d01
Add support for incremental builds of results
|
8 months ago |
Nicholas Carlini
|
e98bcc1e22
Fix golfing question again
|
8 months ago |
Nicholas Carlini
|
5c9a6521e0
Fix 20 questions
|
8 months ago |
Nicholas Carlini
|
b0d674b92c
Fix golfing question
|
8 months ago |
Nicholas Carlini
|
4e890ca464
Fix webgl draw test
|
8 months ago |
Nicholas Carlini
|
38d421bf8c
Code golf questions!
|
8 months ago |
david
|
ebeb8876c5
Merge branch 'main' of https://github.com/carlini/yet-another-applied-llm-benchmark
|
8 months ago |
Nicholas Carlini
|
49207c3ed7
Add a few new test cases
|
8 months ago |
Nicholas Carlini
|
55175af4a1
Update anthropic llm to latest API
|
8 months ago |
Nicholas Carlini
|
0e7238803b
Six new tests
|
8 months ago |
Nicholas Carlini
|
4b56a1e278
Merge pull request #15 from RyanSaxe/fix/incorrect_hparams
|
8 months ago |
RyanSaxe
|
a6c1e2e36a
name written twice
|
8 months ago |
RyanSaxe
|
75e11cbf67
I noticed that the hparams in the config file that were being accessed had copy-pasted code that had the wrong llm name. Updated to the right name according to class object
|
8 months ago |
david
|
d4a50ae671
custom model support by OpenAI compatible API, and fix program_pipes_cpp, program_pipes_python test
|
8 months ago |
david
|
30f36116f6
demo config
|
8 months ago |
david
|
551ba34582
remove config.json
|
8 months ago |