Fix: improve file filtering, fix yaml parsing, add openrouter support. #71

Closed
wants to merge 0 commits into from

Conversation

redreamality
Contributor

relpath = os.path.relpath(filepath, directory)
else:
relpath = filepath
rel_root = os.path.relpath(root, directory) if use_relative_paths else root
Member

Shouldn't this already have been addressed by another PR (#67)?
Could you check whether that PR made the same optimization?
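
For reference, the folder-level exclusion under discussion can be sketched roughly as below. This is a minimal illustration, not the code from this PR or from #67: the function names `walk_with_pruning` and `_is_excluded`, the `exclude_patterns` parameter, and the glob-style matching are all assumptions. The key idea is to prune `dirs` in place so that `os.walk` never descends into an excluded folder.

```python
import fnmatch
import os


def _is_excluded(path, patterns):
    # Hypothetical helper: match the whole relative path or just its last
    # component against each glob pattern.
    path = path.replace(os.sep, "/")
    base = path.rsplit("/", 1)[-1]
    return any(fnmatch.fnmatch(path, p) or fnmatch.fnmatch(base, p) for p in patterns)


def walk_with_pruning(directory, exclude_patterns=(), use_relative_paths=True):
    """Collect file paths under `directory`, skipping excluded folders early."""
    found = []
    for root, dirs, files in os.walk(directory):
        rel_root = os.path.relpath(root, directory) if use_relative_paths else root

        # Assigning to dirs[:] mutates the list os.walk iterates over, so
        # excluded folders are never entered (top-down walk only).
        dirs[:] = [
            d for d in dirs
            if not _is_excluded(os.path.normpath(os.path.join(rel_root, d)), exclude_patterns)
        ]

        for name in files:
            relpath = os.path.normpath(os.path.join(rel_root, name))
            if not _is_excluded(relpath, exclude_patterns):
                found.append(relpath)
    return sorted(found)
```

Compared with filtering every file path after the walk, pruning avoids listing the contents of large excluded trees at all.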

@@ -0,0 +1,20 @@
import re

def add_indentation(text):
Member

I would prefer not to fix indentation as a post-processing step.
If an LLM fails to output valid YAML, there could be many causes. I'm a bit worried that patching a problematic output will still give a result that's off in content. I would just let the LLM retry.
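
A retry-instead-of-repair approach could look something like the sketch below. It is only an illustration under assumptions: `generate_with_retry`, `call_llm`, `parse`, and `parse_errors` are hypothetical names, and the project's actual retry mechanism may differ. The parser is passed in as a callable, so with PyYAML one would pass `yaml.safe_load` and `parse_errors=(yaml.YAMLError,)`.

```python
import json


def generate_with_retry(call_llm, parse, prompt, max_retries=3, parse_errors=(ValueError,)):
    """Call `call_llm(prompt)` until `parse` accepts the output or retries run out."""
    last_error = None
    for _ in range(max_retries):
        raw = call_llm(prompt)
        try:
            # Malformed output is discarded, never patched up.
            return parse(raw)
        except parse_errors as exc:
            last_error = exc
    raise RuntimeError(f"no parseable output after {max_retries} attempts") from last_error


# Usage with a stand-in model that fails once, then succeeds.
# json.loads is used here only to keep the example stdlib-only.
_outputs = iter(["not valid", '{"name": "Node"}'])
result = generate_with_retry(lambda prompt: next(_outputs), json.loads, "describe the repo")
```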

Contributor Author

I agree that fixing YAML indentation is a temporary solution. In fact, the need arises from the large-repo use case. A new issue might be a more suitable place to address it.

FYI:
I tested it on some internal projects with deepseek, gemini, qwen, and claude.
When the total is below 100k tokens, things usually work well.
At ~700k tokens, gemini-2.5 sometimes succeeds after a retry; gemini-2.0 fails nearly every time (at the same point, with the indentation problem).

nodes.py Outdated
@@ -117,7 +121,7 @@ def exec(self, prep_res):
{context}

{language_instruction}Analyze the codebase context.
-Identify the top 5-10 core most important abstractions to help those new to the codebase.
+Identify the top 5-20 core most important abstractions to help those new to the codebase.
Member

Do you find 20 to be a good change? I tuned this before and found that abstractions beyond 10 become less interesting and a bit too low-level to read. Maybe make this a tunable parameter?

Contributor Author

sure
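
One way the tunable limit might be wired into the prompt is sketched below. The function name `build_abstraction_prompt` and the `min_abstractions`/`max_abstractions` parameters are hypothetical, and the default of 10 simply mirrors the original prompt text; how the value would be exposed (CLI flag, config, shared store) is left open.

```python
def build_abstraction_prompt(context, language_instruction="",
                             min_abstractions=5, max_abstractions=10):
    """Build the analysis prompt with a configurable abstraction range."""
    if not 1 <= min_abstractions <= max_abstractions:
        raise ValueError("expected 1 <= min_abstractions <= max_abstractions")
    return (
        f"{context}\n\n"
        f"{language_instruction}Analyze the codebase context.\n"
        f"Identify the top {min_abstractions}-{max_abstractions} core most important "
        "abstractions to help those new to the codebase."
    )


# Large repos could opt in to the wider range without changing the default.
prompt = build_abstraction_prompt("<file listing>", max_abstractions=20)
```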

@zachary62
Member

Improved the speed of file filtering in crawl_local_files.py with folder-level exclusion, partially solves #23

Love it! Could you check whether it has already been implemented by a previous PR?

Added fix_yaml.py utility for YAML indentation fixes

I'd rather just let the LLM retry. In my experience, when an LLM messes up the indentation, it usually also messes up the content a bit.

Updated nodes.py to support up to 20 core abstractions #23

Could you make it tunable?

Added an option for no cache.
Added OpenRouter support #51

Love them!
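
As a rough illustration of the no-cache option, a cache bypass flag could be sketched as follows. Everything here is an assumption for illustration: the `use_cache` flag, the `backend` callable standing in for the real model call, and the `llm_cache.json` file name are not taken from the PR.

```python
import json
import os


def call_llm(prompt, backend, use_cache=True, cache_file="llm_cache.json"):
    """Return backend(prompt), optionally reusing a JSON file cache keyed by prompt."""
    cache = {}
    if use_cache and os.path.exists(cache_file):
        with open(cache_file, encoding="utf-8") as f:
            cache = json.load(f)
        if prompt in cache:
            return cache[prompt]

    # Cache miss, or caching disabled: always hit the backend.
    response = backend(prompt)

    if use_cache:
        cache[prompt] = response
        with open(cache_file, "w", encoding="utf-8") as f:
            json.dump(cache, f)
    return response
```

With `use_cache=False` the cache file is neither read nor written, which is useful when a retry must produce a fresh response rather than replay a cached bad one.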

@redreamality
Contributor Author

now addressed in #74
