Compare commits

...

7 commits

Author SHA1 Message Date
qodo-merge-for-open-source[bot]
3b3a1cd762
Merge cd68a92283 into dcab8a893c 2025-10-14 14:54:31 -07:00
ofir-frd
dcab8a893c
Merge pull request #2068 from qodo-ai/of/toggleable-compliance-sections-docs
Some checks failed
Build-and-test / build-and-test (push) Has been cancelled
docs-ci / deploy (push) Has been cancelled
docs: add section visibility configuration options to compliance tool
2025-10-14 21:18:14 +03:00
ofir-frd
2230da73fd docs: clarify description of codebase duplication compliance section in documentation 2025-10-14 21:04:03 +03:00
ofir-frd
f251a8a6cb docs: add section visibility configuration options to compliance tool documentation 2025-10-14 20:56:43 +03:00
ofir-frd
9fd28e5680
Merge pull request #2065 from qodo-ai/of/claude-sonnet-4-5-pr_benchmark
docs: add Claude Sonnet 4.5 benchmark
2025-10-13 13:28:53 +03:00
ofir-frd
d0bbc56480 docs: add Claude Sonnet 4.5 benchmark results to PR benchmark documentation 2025-10-13 13:24:33 +03:00
qodo-merge-pro-for-open-source[bot]
cd68a92283
Generated best practices file 2025-04-16 10:18:11 +00:00
3 changed files with 246 additions and 2 deletions

199
best_practices.md Normal file
View file

@ -0,0 +1,199 @@
<b>Pattern 1: Wrap critical operations with try-except blocks to handle potential exceptions, especially for file operations, API calls, and data parsing functions.</b>
Example code before:
```
def get_git_repo_url(self, issues_or_pr_url: str) -> str:
repo_path = self._get_owner_and_repo_path(issues_or_pr_url)
if not repo_path or repo_path not in issues_or_pr_url:
get_logger().error(f"Unable to retrieve owner/path from url: {issues_or_pr_url}")
return ""
return f"{issues_or_pr_url.split(repo_path)[0]}{repo_path}.git"
```
Example code after:
```
def get_git_repo_url(self, issues_or_pr_url: str) -> str:
try:
repo_path = self._get_owner_and_repo_path(issues_or_pr_url)
if not repo_path or repo_path not in issues_or_pr_url:
get_logger().error(f"Unable to retrieve owner/path from url: {issues_or_pr_url}")
return ""
return f"{issues_or_pr_url.split(repo_path)[0]}{repo_path}.git"
except Exception as e:
get_logger().error(f"Failed to get git repo url from {issues_or_pr_url}, error: {e}")
return ""
```
<details><summary>Examples for relevant past discussions:</summary>
- https://github.com/qodo-ai/pr-agent/pull/1644#discussion_r2013912636
- https://github.com/qodo-ai/pr-agent/pull/1263#discussion_r1782129216
- https://github.com/qodo-ai/pr-agent/pull/1391#discussion_r1879870807
</details>
___
<b>Pattern 2: Use proper logging methods instead of print statements, with get_logger().error() for errors, get_logger().warning() for warnings, and get_logger().info() for informational messages.</b>
Example code before:
```
if isinstance(response_tuple, tuple) and len(response_tuple) == 3:
response_json = json.loads(response_tuple[2])
else:
print("Unexpected response format:", response_tuple)
return sub_issues
```
Example code after:
```
if isinstance(response_tuple, tuple) and len(response_tuple) == 3:
response_json = json.loads(response_tuple[2])
else:
get_logger().error(f"Unexpected response format", artifact={"response": response_tuple})
return sub_issues
```
<details><summary>Examples for relevant past discussions:</summary>
- https://github.com/qodo-ai/pr-agent/pull/1529#discussion_r1958684550
- https://github.com/qodo-ai/pr-agent/pull/1529#discussion_r1958686068
- https://github.com/qodo-ai/pr-agent/pull/1529#discussion_r1964110734
- https://github.com/qodo-ai/pr-agent/pull/1634#discussion_r2007976915
</details>
___
<b>Pattern 3: Move specific imports to where they are actually used rather than at the top of the file, especially for rarely used or heavy dependencies.</b>
Example code before:
```
import os
from azure.identity import ClientSecretCredential
import litellm
import openai
import requests
```
Example code after:
```
import os
import litellm
import openai
import requests
# Later in the code where Azure AD is actually used:
if get_settings().get("AZURE_AD.CLIENT_ID", None):
from azure.identity import ClientSecretCredential
# Azure AD specific code...
```
<details><summary>Examples for relevant past discussions:</summary>
- https://github.com/qodo-ai/pr-agent/pull/1698#discussion_r2046221654
</details>
___
<b>Pattern 4: Add defensive checks for potentially None or invalid values before performing operations on them, especially when working with external data or API responses.</b>
Example code before:
```
model_is_from_o_series = re.match(r"^o[1-9](-mini|-preview)?$", model)
if ('gpt' in get_settings().config.model.lower() or model_is_from_o_series) and get_settings().get('openai.key'):
return encoder_estimate
```
Example code after:
```
if model is None:
get_logger().warning("Model is None, cannot determine model type accurately")
return encoder_estimate
if not isinstance(model, str):
get_logger().warning(f"Model is not a string type: {type(model)}")
return encoder_estimate
model_is_from_o_series = re.match(r"^o[1-9](-mini|-preview)?$", model)
openai_key_exists = get_settings().get('openai.key') is not None
if (('gpt' in model.lower() or model_is_from_o_series) and openai_key_exists):
return encoder_estimate
```
<details><summary>Examples for relevant past discussions:</summary>
- https://github.com/qodo-ai/pr-agent/pull/1644#discussion_r2032621065
- https://github.com/qodo-ai/pr-agent/pull/1529#discussion_r1958694146
- https://github.com/qodo-ai/pr-agent/pull/1391#discussion_r1879875496
</details>
___
<b>Pattern 5: Avoid redundant code initialization and reuse existing objects or instances when possible, especially for resource-intensive operations.</b>
Example code before:
```
if tickets:
provider = GithubProvider()
for ticket in tickets:
# Extract sub-issues
sub_issues_content = []
try:
sub_issues = provider.fetch_sub_issues(ticket)
```
Example code after:
```
if tickets:
for ticket in tickets:
# Extract sub-issues
sub_issues_content = []
try:
sub_issues = git_provider.fetch_sub_issues(ticket)
```
<details><summary>Examples for relevant past discussions:</summary>
- https://github.com/qodo-ai/pr-agent/pull/1529#discussion_r1964085987
- https://github.com/qodo-ai/pr-agent/pull/1529#discussion_r1964088304
</details>
___
<b>Pattern 6: Use descriptive variable names and add explanatory comments for complex logic or non-obvious code to improve maintainability and readability.</b>
Example code before:
```
issues = value
for i, issue in enumerate(issues):
try:
if not issue or not isinstance(issue, dict):
continue
```
Example code after:
```
focus_areas = value
for i, focus_area in enumerate(focus_areas):
try:
# Skip empty issues or non-dictionary items to ensure valid data structure
if not focus_area or not isinstance(focus_area, dict):
continue
```
<details><summary>Examples for relevant past discussions:</summary>
- https://github.com/qodo-ai/pr-agent/pull/1262#discussion_r1782097201
- https://github.com/qodo-ai/pr-agent/pull/1262#discussion_r1782097204
- https://github.com/qodo-ai/pr-agent/pull/1583#discussion_r1971790979
</details>
___

View file

@ -82,6 +82,12 @@ A list of the models used for generating the baseline suggestions, and example r
<td style="text-align:left;">unknown</td>
<td style="text-align:center;"><b>41.7</b></td>
</tr>
<tr>
<td style="text-align:left;">Claude-sonnet-4-5</td>
<td style="text-align:left;">2025-09-29</td>
<td style="text-align:left;"></td>
<td style="text-align:center;"><b>40.7</b></td>
</tr>
<tr>
<td style="text-align:left;">Claude-4-sonnet</td>
<td style="text-align:left;">2025-05-14</td>
@ -182,6 +188,24 @@ weaknesses:
- **False positives / speculative fixes:** In several cases it flags non-issues (style, performance, redundant code) or supplies debatable “improvements”, lowering precision and sometimes breaching the “critical bugs only” rule.
- **Inconsistent error coverage:** For certain domains (build scripts, schema files, test code) it either returns an empty list when real regressions exist or proposes cosmetic edits, indicating gaps in specialised knowledge.
### Claude-sonnet-4-5
Final score: **40.7**
strengths:
- **Concise & well-formatted output:** Most replies strictly follow the schema, stay within the 3-suggestion limit, and include clear, copy-paste-ready patches, making them easy to apply.
- **Can spot headline bugs:** When a single, obvious regression is present (e.g. duplicated regex block, missing null-check, wrong macro name) the model often detects it and proposes an accurate, minimal fix.
- **Scope discipline (usually):** It frequently restricts changes to newly-added lines and avoids broad refactors, so many answers comply with the “new code only / critical bugs only” rule.
- **Reasonable explanations:** The accompanying rationales are typically short but precise, helping reviewers understand why the change is needed.
weaknesses:
- **Low recall of critical issues:** In a large fraction of examples the model misses the primary bug or flags nothing at all while other reviewers find clear problems. Coverage is therefore unreliable.
- **False or harmful fixes:** A notable number of suggestions mis-diagnose the code, touch unchanged lines, violate task rules, or would break compilation/runtime (wrong paths, bad types, guideline-forbidden advice).
- **Priority mistakes:** The model often downgrades severe defects to “general” or upgrades cosmetic nits to “critical”, showing weak bug-severity judgment.
- **Inconsistent quality:** Performance swings widely between excellent and poor; reviewers cannot predict whether a given answer will be thorough, partial, or incorrect.
### Claude-4 Sonnet (4096 thinking tokens)
Final score: **39.7**
@ -195,9 +219,9 @@ strengths:
weaknesses:
- **Low coverage / recall:** Regularly surfaces only one minor issue (or none) while missing other, often more severe, problems caught by peer models.
- **High “empty-list” rate:** In many diffs the model returns no suggestions even when clear critical bugs exist, offering zero reviewer value.
- **High "empty-list" rate:** In many diffs the model returns no suggestions even when clear critical bugs exist, offering zero reviewer value.
- **Occasional incorrect or harmful fixes:** A non-trivial number of suggestions are speculative, contradict code intent, or would break compilation/runtime; sometimes duplicates or contradicts itself.
- **Inconsistent severity labelling & duplication:** Repeats the same point in multiple slots, marks cosmetic edits as “critical”, or leaves `improved_code` identical to original.
- **Inconsistent severity labelling & duplication:** Repeats the same point in multiple slots, marks cosmetic edits as "critical", or leaves `improved_code` identical to original.
### Claude-4 Sonnet

View file

@ -292,6 +292,27 @@ enable_global_pr_compliance = true
</tr>
</table>
???+ example "Section visibility options"
<table>
<tr>
<td><b>enable_security_section</b></td>
<td>If set to true, the security compliance section will be displayed in the output. When false, the entire security section is hidden. Default is true.</td>
</tr>
<tr>
<td><b>enable_ticket_section</b></td>
<td>If set to true, the ticket compliance section will be displayed in the output. When false, the entire ticket section is hidden. Default is true.</td>
</tr>
<tr>
<td><b>enable_codebase_duplication_section</b></td>
<td>If set to true, the codebase duplication compliance section will be displayed in the output. When false, the entire codebase duplication section is hidden. Default is true.</td>
</tr>
<tr>
<td><b>enable_custom_compliance_section</b></td>
<td>If set to true, the custom compliance section will be displayed in the output. When false, the entire custom section is hidden. Default is true.</td>
</tr>
</table>
???+ example "Security compliance options"
<table>