Mirror of https://github.com/qodo-ai/pr-agent.git (synced 2025-12-12 02:45:18 +00:00)
docs: add Gemini-2.5-flash-preview benchmark comparisons to PR benchmark table
parent 3ddd53d4fe
commit 72bcb0ec4c
1 changed file with 13 additions and 0 deletions
@@ -47,6 +47,18 @@ Here's a summary of the win rates based on the benchmark:
 <td style="text-align:left;">Gemini-2.5-pro-preview-05-06</td>
 <td style="text-align:left;">Sonnet 3.7</td>
 <td style="text-align:center; color: #1E8449;"><b>78.1%</b></td> <td style="text-align:center; color: #D8000C;"><b>21.9%</b></td> </tr>
+<tr>
+<td style="text-align:left;">Gemini-2.5-pro-preview-05-06</td>
+<td style="text-align:left;">Gemini-2.5-flash-preview-04-17</td>
+<td style="text-align:center; color: #1E8449;"><b>73.0%</b></td> <td style="text-align:center; color: #D8000C;"><b>27.0%</b></td> </tr>
+<tr>
+<td style="text-align:left;">Gemini-2.5-flash-preview-04-17</td>
+<td style="text-align:left;">GPT-4.1</td>
+<td style="text-align:center; color: #1E8449;"><b>54.6%</b></td> <td style="text-align:center; color: #D8000C;"><b>45.4%</b></td> </tr>
+<tr>
+<td style="text-align:left;">Gemini-2.5-flash-preview-04-17</td>
+<td style="text-align:left;">Sonnet 3.7</td>
+<td style="text-align:center; color: #1E8449;"><b>60.6%</b></td> <td style="text-align:center; color: #D8000C;"><b>39.4%</b></td> </tr>
 <tr>
 <td style="text-align:left;">GPT-4.1</td>
 <td style="text-align:left;">Sonnet 3.7</td>
@@ -54,6 +66,7 @@ Here's a summary of the win rates based on the benchmark:
 </tbody>
 </table>
 
+
 ## Gemini-2.5-pro-preview-05-06 - Model Card
 
 ### Comparison against GPT-4.1