Nicholas Carlini
|
16f8d13900
Add a few more recent tests
|
5 months ago |
Nicholas Carlini
|
be07d215b2
Improve evaluations for several tests
|
7 months ago |
Nicholas Carlini
|
2e14f5e5d0
Fix python version errors
|
7 months ago |
Nicholas Carlini
|
53d1251d88
A dozen new tests from the last month
|
7 months ago |
Nicholas Carlini
|
38d421bf8c
Code golf questions!
|
8 months ago |
Nicholas Carlini
|
5b7c71d4c4
Add try/catch to stop failures
|
9 months ago |
Grant Williams
|
b1cfa13350
update RustRun docstring
|
9 months ago |
Nicholas Carlini
|
baec772990
Fix typo in JSON object conversion command; #1 from Evanc123/patch-1
|
9 months ago |
Viswa
|
c09ecb6e9d
update python env varaible
|
9 months ago |
Evan Cater
|
da9d53df54
Update evaluator.py
|
9 months ago |
Nicholas Carlini
|
872a90b3ae
Minor changes
|
10 months ago |
Nicholas Carlini
|
23c2965686
Add podman option
|
10 months ago |
Nicholas Carlini
|
e50ca985fc
A bunch of changes for release
|
10 months ago |
Nicholas Carlini
|
e676a59407
Produce logfile of runs
|
11 months ago |
Nicholas Carlini
|
1d173a1627
Prepare description addition
|
11 months ago |
srxzr
|
521ed3896c
adding preample and also new tests
|
11 months ago |
Nicholas Carlini
|
5656488a16
Add ability for llm to work with interactive processes
|
11 months ago |
Nicholas Carlini
|
e425c714aa
More tests, fixes to models
|
11 months ago |
Nicholas Carlini
|
c1a909f67b
Five new tests
|
11 months ago |
Nicholas Carlini
|
0737c24c6f
Split llms across files, a few new tests
|
11 months ago |
Nicholas Carlini
|
70e5ca5889
A bunch of tests
|
11 months ago |
Nicholas Carlini
|
af7c7c67f7
Add a bunch of evaluators, rewrite the eval interface
|
11 months ago |
Nicholas Carlini
|
f15c042cf2
Initial commit; framework skeleton
|
1 year ago |