Nicholas Carlini
|
16f8d13900
Add a few more recent tests
|
5 months ago |
Nicholas Carlini
|
be07d215b2
Improve evaluations for several tests
|
7 months ago |
Nicholas Carlini
|
2e14f5e5d0
Fix python version errors
|
7 months ago |
Nicholas Carlini
|
53d1251d88
A dozen new tests from the last month
|
7 months ago |
Nicholas Carlini
|
38d421bf8c
Code golf questions!
|
8 months ago |
Nicholas Carlini
|
5b7c71d4c4
Add try/catch to stop failures
|
9 months ago |
Grant Williams
|
b1cfa13350
update RustRun docstring
|
9 months ago |
Nicholas Carlini
|
baec772990
Fix typo in JSON object conversion command; #1 from Evanc123/patch-1
|
9 months ago |
Viswa
|
c09ecb6e9d
update python env varaible
|
9 months ago |
Evan Cater
|
da9d53df54
Update evaluator.py
|
9 months ago |
Nicholas Carlini
|
872a90b3ae
Minor changes
|
10 months ago |
Nicholas Carlini
|
23c2965686
Add podman option
|
10 months ago |
Nicholas Carlini
|
e50ca985fc
A bunch of changes for release
|
10 months ago |
Nicholas Carlini
|
e676a59407
Produce logfile of runs
|
11 months ago |
Nicholas Carlini
|
1d173a1627
Prepare description addition
|
11 months ago |
srxzr
|
521ed3896c
adding preample and also new tests
|
11 months ago |
Nicholas Carlini
|
5656488a16
Add ability for llm to work with interactive processes
|
11 months ago |
Nicholas Carlini
|
e425c714aa
More tests, fixes to models
|
11 months ago |
Nicholas Carlini
|
c1a909f67b
Five new tests
|
11 months ago |
Nicholas Carlini
|
0737c24c6f
Split llms across files, a few new tests
|
11 months ago |
Nicholas Carlini
|
70e5ca5889
A bunch of tests
|
11 months ago |
Nicholas Carlini
|
af7c7c67f7
Add a bunch of evaluators, rewrite the eval interface
|
11 months ago |
Nicholas Carlini
|
f15c042cf2
Initial commit; framework skeleton
|
1 year ago |