2025-01-07 - DONGLE

<< [[2025-01-06|Before]] | [[2025-01-10|Next]] >>

## Goal
- Tune the `fileSelect Agent` so that its results are accurate and consistent
- Find all image files in a specific directory
- Find all images related to the files that match the question
- Build an answer to the question using all of the related files and images

## TODO
> [!TODO] TODO
> - [v] Find the image files in a specific directory
> - [v] Use fileSelect to organize the images related to the key documents
> - [v] Read the image files and extract them as text
> - [v] Make the fileSelect results consistent
> - [-] Read every file in the fileSelect results and answer the question

## Issue/Solution
#### Issue 1
> [!blank]
>
>> [!warning]+ Problem
>> I tried to use `crewAI`'s `Vision Tool` to convert the `image`s in a given `file` into `text`. The problem was that the `Vision Tool` did not work correctly.
>
>> [!summary]+ Solution
>> After looking into the `Vision Tool`, I found that it does not support the `svg` format.
>> All of the `images` in my `obsidian` notes are `svg`, so the `Vision Tool` failed while loading those `Images`.
>> To work around this, I set the auto-export format of the `Excalidraw` plugin in `obsidian` to `png`, converted every `file` to `png`, and then used the `Vision Tool`.
>> When the project is released, though, it would be good to let users choose which `Image` to use, or to provide a feature that converts the `Image` format.

#### Issue 2 ([[2025-01-06#Issue 3|Before Issue]])
> [!blank]
>
>> [!warning]+ Problem
>> The results of the `fileSelect Agent` were both inaccurate and inconsistent.
>
>> [!summary]+ Solution
>> To improve accuracy, I kept revising the `Prompt` of the `Agent` and the `Task`, but the results were never satisfactory.
>> I will probably need a different approach to improve accuracy. For consistency, I used **pydantic** to specify the `output` format so that the results come out in the same shape every time.

#### Issue 3
> [!blank]
>
>> [!warning]+ Problem
>> I built a `textOrganizer Agent` that reads each `markdown` file, analyzes it, and outputs the result.
>> The problem is that this process takes far too long, and it does not even work properly.
>
>> [!summary]+ Solution
>> To solve this, I think `CrewAI` should be used only for the `image`s, while `LangChain` reads and analyzes each `path` and combines everything into a single result.

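
As a rough sketch of the image search listed under Goal and the `svg` problem from Issue 1, the snippet below walks a directory, collects image files, and separates the `svg` files that would still need converting to `png`. It is only an illustration, not the project's actual `imgPathSearcher` code: the function name `find_images` and the set of extensions counted as images are assumptions, and the only format treated as unusable is `svg`, because that is the one this note observed failing.

```python
from pathlib import Path

# Assumption: which extensions count as "image files" in the vault.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg"}
# From Issue 1: svg files could not be loaded by the Vision Tool.
NEEDS_CONVERSION = {".svg"}


def find_images(directory):
    """Return (usable, needs_conversion) lists of image paths under directory.

    Hypothetical helper: searches subdirectories too and compares
    extensions case-insensitively.
    """
    usable, needs_conversion = [], []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix not in IMAGE_EXTENSIONS:
            continue  # skip markdown notes and other non-image files
        if suffix in NEEDS_CONVERSION:
            needs_conversion.append(path)
        else:
            usable.append(path)
    return usable, needs_conversion
```

Calling `find_images` on the vault directory returns two lists: the first can be handed to the image-to-text step as is, and the second shows which files still have to be exported as `png` first. Keeping the two apart would also make it easier to add the user-facing format conversion mentioned in Issue 1 later.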
## Reference
- [[Agent#Pydantic|Pydantic]]
- [Vision Tool](https://docs.crewai.com/tools/visiontool#visiontool)

## Git commit contents
- [build : imgPathSearcher Agent and imgPathSearch Task](https://github.com/Donghyeon-Shin/DocumentSecretary/commit/8a61a0a74d6453c23f5bd2ed8636b8ebac76a26c "build : imgPathSearcher Agent and imgPathSearch Task")
- [docs : update package list (crewAI upgrade)](https://github.com/Donghyeon-Shin/DocumentSecretary/commit/b2bac147b3936eb6cd6c0a3a3d5ac1541d91f8cc "docs : update package list (crewAI upgrade)")
- [build : imgExtracter Agent and imgExtract Task](https://github.com/Donghyeon-Shin/DocumentSecretary/commit/d470ab7539b41e70c0f9e9b80834566c23faef4e "build : imgExtracter Agent and imgExtract Task")
- [fix : revise fileSelect expected_output content](https://github.com/Donghyeon-Shin/DocumentSecretary/commit/52e3d6ffc0ab7bf375a52c13c521180daa27a89c "fix : revise fileSelect expected_output content")
- [fix : make fileSelect expected_output print in Json format](https://github.com/Donghyeon-Shin/DocumentSecretary/commit/c47c43530888c8c1a562d5ed64fa1ac0e0364c24 "fix : make fileSelect expected_output print in Json format")
- [build : textOrganizer Agent and textOrganize Task](https://github.com/Donghyeon-Shin/DocumentSecretary/commit/e36ae94baa2cfe7e0a80e33511edbfa8f72a0fc6 "build : textOrganizer Agent and textOrganize Task")

## What should I do more
- Implement part of the `CrewAI` functionality with `LangChain`
- Take a question and return the result