
Conversation

harheem
Contributor

@harheem harheem commented Jan 21, 2025

Author Checklist

  • PR Title Format: I have confirmed that the PR title follows the correct format. (e.g., [N-2] 07-Text Splitter / 07-RecursiveCharacterTextSplitter)

  • Committed Files: I have ensured that no unnecessary files (e.g., .bin, .gitignore, poetry.lock, pyproject.toml) are included. These files are not allowed.

  • (Optional) Related Issue: If this PR is linked to an issue, I have referenced the issue number in the PR message. (e.g., Fixes Update 01-PromptTemplate.ipynb #123)

  • ❌ Do not include unnecessary files (e.g., .bin, .gitignore, poetry.lock, pyproject.toml) or other people's code. If included, close the PR and create a new PR.

Review Template (Initial PR)

🖥️ OS: Win/Mac/Linux   
✅ Checklist      
 - [ ] **Template**: Tutorial follows the required template.
 - [ ] **Table of Contents (TOC) Links**: All Table of Contents links work. (Yes/No)
 - [ ] **Image**: Image filenames follow guidelines.
 - [ ] **Imports**: All import statements use the latest versions. Ensure "langchain-teddynote" is not used.
 - [ ] **Code Execution**: Code runs without errors.
 - Comments: {Write freely; Korean is fine}

If no one reviews your PR within a few days, please @-mention one of teddylee777, musangk, BAEM1N

@harheem harheem self-assigned this Jan 21, 2025
@harheem harheem requested review from sungchul2 and Secludor January 21, 2025 15:31
@chaeyoonyunakim chaeyoonyunakim self-requested a review January 21, 2025 23:42
@chaeyoonyunakim chaeyoonyunakim added docs tutorial proofreading 번역/검수팀 제안사항 반영 labels Jan 21, 2025
Contributor

@sungchul2 sungchul2 left a comment

🖥️ OS: Linux
✅ Checklist

  • Template: Tutorial follows the required template.
  • Table of Contents (TOC) Links: All Table of Contents links work. (Yes/No)
  • Image: Image filenames follow guidelines.
  • Imports: All import statements use the latest versions. Ensure "langchain-teddynote" is not used.
  • Code Execution: Code runs without errors.
  • Comments: {Write freely; Korean is fine}
    • Thank you for all the hard work on such a long tutorial 😄
    • I left a few comments; I'd appreciate it if you could take a look.

@Secludor
Contributor

🖥️ OS: Win
✅ Checklist

  • Template: Tutorial follows the required template.
  • Table of Contents (TOC) Links: All Table of Contents links work. (Yes/No)
  • Image: Image filenames follow guidelines. (N/A)
  • Imports: All import statements use the latest versions. Ensure "langchain-teddynote" is not used.
  • Code Execution: Code runs without errors.
  • Comments: {Write freely; Korean is fine}
    • Running output = run_graph( super_graph, ~~) may be unstable: I got an error saying the termination condition had not been met even after reaching 150 iterations, but it worked fine when I ran it again! Great work :)

@harheem
Contributor Author

harheem commented Jan 25, 2025

@rlatjcj @Secludor Thank you so much for the reviews and comments! ☺️ I've summarized the changes below. Thanks again for reviewing such a long tutorial so carefully.

[Changes]

  • Improved the prompts so that the output is produced reliably.
  • Switched to run_graph and visualize_graph from the langchain_opentutorial library.
  • Added the beautifulsoup4 package.
  • I noticed pointless repeated work, so I changed the model to 'gpt-4o-mini', which improved this.

Secludor
Secludor previously approved these changes Jan 25, 2025
Contributor

@Secludor Secludor left a comment

🖥️ OS: Win
✅ Checklist

  • Template: Tutorial follows the required template.
  • Table of Contents (TOC) Links: All Table of Contents links work. (Yes/No)
  • Image: Image filenames follow guidelines. (N/A)
  • Imports: All import statements use the latest versions. Ensure "langchain-teddynote" is not used.
  • Code Execution: Code runs without errors.
  • Comments: {Write freely; Korean is fine}
    • Great work!! Happy New Year :)

@sungchul2
Contributor

sungchul2 commented Jan 25, 2025

#490 (comment)
Why are there so many line breaks in this part?

The Yahoo Finance output I captured and the arXiv output below both look similar, so it seems the HTML returned by web scraping is simply structured that way.

@sungchul2
Contributor

sungchul2 commented Jan 31, 2025

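#490 (comment) Why are there so many line breaks in this part?

The Yahoo Finance output I captured and the arXiv output below both look similar, so it seems the HTML returned by web scraping is simply structured that way.

@harheem I don't think it's a good idea for logs like this to be published to GitBook as they are. (@sooyoung-wind could you take a look?)
The issue shows up in the tools in [WebScraper] log, so we could, for example, remove the line breaks through the WebScraper prompt, or pass a callback to the invoke_graph function that rewrites only the tools in [WebScraper] log.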
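
A minimal sketch of the callback idea, assuming the graph runner can pass each log entry to a function together with the name of the node that produced it. The `clean_node_log` helper and its `(node_name, message)` signature are hypothetical and are not part of `invoke_graph` or any library API; the sketch only illustrates rewriting `WebScraper` entries while leaving every other node's log untouched.

```python
import re


def clean_node_log(node_name: str, message: str, target: str = "WebScraper") -> str:
    """Collapse runs of whitespace, but only for logs from the target node.

    Hypothetical helper: the (node_name, message) signature is an assumption
    made for illustration, not an existing callback interface.
    """
    if node_name != target:
        return message  # Logs from every other node pass through unchanged
    return re.sub(r"\s+", " ", message).strip()


scraped = "Yahoo Finance\n\n\n   Markets\n\n\nNews"
print(clean_node_log("WebScraper", scraped))
print(repr(clean_node_log("Searcher", "line one\nline two")))
```

Filtering by node name keeps the change local: only the noisy scraper output is touched, so logs from other agents keep their original formatting.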

@harheem
Contributor Author

harheem commented Feb 1, 2025

@rlatjcj Thank you so much for the great feedback. I've updated the code so that the scraped output comes out clean.

# Imports this snippet relies on
import re
from typing import List

from bs4 import BeautifulSoup
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.tools import tool


# Define tool for scraping detailed information from web pages
@tool
def scrape_webpages(urls: List[str]) -> str:
    """Scrape web pages and clean unnecessary whitespace."""
    loader = WebBaseLoader(
        web_path=urls,
        header_template={
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36",
        },
    )

    docs = loader.load()

    def clean_text(html: str) -> str:
        # Drop any leftover markup, then collapse every whitespace run into one space
        soup = BeautifulSoup(html, "html.parser")
        text = soup.get_text(separator=" ").strip()
        return re.sub(r"\s+", " ", text)  # Remove excessive whitespace

    return "\n\n".join(
        [
            f'<Document name="{doc.metadata.get("title", "").strip()}">\n{clean_text(doc.page_content)}\n</Document>'
            for doc in docs
        ]
    )
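
To see the whitespace-collapsing step on its own, here is a small standard-library sketch. The `collapse_whitespace` name and the sample string are made up for illustration; the string only mimics the kind of newline-heavy text that scraped pages tend to produce.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace every run of spaces, tabs, and newlines with a single space."""
    return re.sub(r"\s+", " ", text).strip()


sample = "arXiv\n\n\n\n  Computer Science\n\t\n\n  Abstract\n"
print(collapse_whitespace(sample))  # arXiv Computer Science Abstract
```

Because `\s+` matches newlines as well as spaces and tabs, a single substitution is enough to flatten the blank lines that appeared in the log output.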
image

sungchul2

This comment was marked as outdated.

Contributor

@sungchul2 sungchul2 left a comment

🖥️ OS: Linux
✅ Checklist

  • Template: Tutorial follows the required template.
  • Table of Contents (TOC) Links: All Table of Contents links work. (Yes/No)
  • Image: Image filenames follow guidelines.
  • Imports: All import statements use the latest versions. Ensure "langchain-teddynote" is not used.
  • Code Execution: Code runs without errors.
  • Comments: {Write freely; Korean is fine}
    • Great work! However, the cleaned-up output is not yet reflected on the main branch. Could you rerun the whole notebook and upload the file?
      image

@sungchul2 sungchul2 self-requested a review February 2, 2025 12:41
@Secludor
Contributor

Secludor commented Feb 3, 2025

🖥️ OS: Win
✅ Checklist

  • Template: Tutorial follows the required template.
  • Table of Contents (TOC) Links: All Table of Contents links work. (Yes/No)
  • Image: Image filenames follow guidelines.
  • Imports: All import statements use the latest versions. Ensure "langchain-teddynote" is not used.
  • Code Execution: Code runs without errors.
  • Comments: {Write freely; Korean is fine}
    • Everything works correctly on Windows as well, and I confirmed that the output prints without extra line breaks when I run it myself! I think the only thing left is to apply the full rerun.

@harheem
Contributor Author

harheem commented Feb 4, 2025

@Secludor @rlatjcj This PR fell short in many ways. I'm truly sorry for the trouble, and thank you so much for the careful review and all the help!
I hope this really is the final request. 😂

Contributor

@sungchul2 sungchul2 left a comment

🖥️ OS: Linux
✅ Checklist

  • Template: Tutorial follows the required template.
  • Table of Contents (TOC) Links: All Table of Contents links work. (Yes/No)
  • Image: Image filenames follow guidelines.
  • Imports: All import statements use the latest versions. Ensure "langchain-teddynote" is not used.
  • Code Execution: Code runs without errors.
  • Comments: {Write freely; Korean is fine}
    • Great work!

@Secludor
Contributor

Secludor commented Feb 5, 2025

🖥️ OS: Win
✅ Checklist

  • Template: Tutorial follows the required template.
  • Table of Contents (TOC) Links: All Table of Contents links work. (Yes/No)
  • Image: Image filenames follow guidelines.
  • Imports: All import statements use the latest versions. Ensure "langchain-teddynote" is not used.
  • Code Execution: Code runs without errors.
  • Comments: {Write freely; Korean is fine}
    • Great work :)

@Hye-yoonJeong Hye-yoonJeong mentioned this pull request Feb 6, 2025
3 tasks
@chaeyoonyunakim chaeyoonyunakim removed the proofreading 번역/검수팀 제안사항 반영 label Feb 7, 2025
Contributor

@chaeyoonyunakim chaeyoonyunakim left a comment

Translation and proofreading verified.

@teddylee777 teddylee777 merged commit 1ba483e into LangChain-OpenTutorial:main Feb 7, 2025
3 of 4 checks passed
Labels
docs tutorial

5 participants