david
|
e0313273ca
fix for openai chatgpt config (for evaluation)
|
8 months ago |
david
|
0f69e5b407
benchmark for bagel-dpo-34b-v0.2-AWQ
|
8 months ago |
Nicholas Carlini
|
47cf703653
Add new LLM runs
|
9 months ago |
Nicholas Carlini
|
5b7c71d4c4
Add try/catch to stop failures
|
9 months ago |
Nicholas Carlini
|
06a4030533
Merge pull request #11 from grantmwilliams/main
|
9 months ago |
Nicholas Carlini
|
4637f697f0
Merge pull request #7 from daulet/daulet/awsv6
|
9 months ago |
Grant Williams
|
b1cfa13350
update RustRun docstring
|
9 months ago |
Nicholas Carlini
|
e5a1af4253
Merge pull request #4 from lychees/main
|
9 months ago |
Nicholas Carlini
|
a6016c475c
Merge pull request #3 from alexisgauba/main
|
9 months ago |
Nicholas Carlini
|
e87ba13181
Update tests/git_merge.py
|
9 months ago |
Daulet Zhanguzin
|
13f9264734
improve AWSV6 test eval
|
9 months ago |
Nicholas Carlini
|
21849004dc
Add more readme text
|
9 months ago |
Nicholas Carlini
|
f080b97655
Edit README to say you probably shouldn't use this in a paper
|
9 months ago |
Nicholas Carlini
|
1e92067b21
Fix #6, gitignore_anywhere.py has incorrect question framed
|
9 months ago |
Your Name
|
ba22b7c70a
remove dead code
|
9 months ago |
Your Name
|
5aa4872242
consolidate to one file
|
9 months ago |
Your Name
|
e3f9ffa463
update merge test
|
9 months ago |
minakokojima
|
776be32d35
Add moonshot model
|
9 months ago |
Your Name
|
ec0708754b
fix commit message titles
|
9 months ago |
Your Name
|
61e46723bb
fix merge conflict test
|
9 months ago |
Your Name
|
4202c73faf
tests: merge & merge conflict
|
9 months ago |
Nicholas Carlini
|
baec772990
Fix typo in JSON object conversion command; #1 from Evanc123/patch-1
|
9 months ago |
Nicholas Carlini
|
8aac3b7424
Add colab notebook demo; #2 from ViswanathaReddyGajjala/main
|
9 months ago |
Viswa
|
83bc04bd04
Update README.md
|
9 months ago |
Viswa
|
c09ecb6e9d
update python env varaible
|
9 months ago |
Viswa
|
b9f105d5ca
add ipynb notebook to run on colab
|
9 months ago |
ViswanathaReddyGajjala
|
2ca8cbaa77
minor fix to successfully run a single test case
|
9 months ago |
Evan Cater
|
da9d53df54
Update evaluator.py
|
9 months ago |
Nicholas Carlini
|
4d19d482d6
Don't spend 3x queries
|
9 months ago |
Nicholas Carlini
|
61df499e1c
A few readme tweask
|
10 months ago |