
Vision and image analysis with the Ollama gemma3 4B model

a_mnesia 2025. 3. 16. 20:05


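Gemma 3 was announced this Wednesday (March 12, 2025). It comes in four sizes, 1B, 4B, 12B, and 27B, each with a pretrained and an instruction-tuned version. The 4B, 12B, and 27B models offer an extended context window (up to 128K) and multimodality.

In this post I test multimodal image processing with a MacBook M1 + Ollama + gemma3.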

Ollama Gemma3

Gemma is a lightweight family of models from Google built on Gemini technology. The Gemma 3 models are multimodal (processing text and images) and feature a 128K context window with support for over 140 languages. Available in 1B, 4B, 12B, and 27B parameter sizes, they excel in tasks like question answering, summarization, and reasoning, while their compact design allows deployment on resource-limited devices.

For a more detailed description of the models, see the "Gemma 3 Technical Report" PDF.

Models

Text

- 1B parameter model (32k context window)

Multimodal (Vision)

- 4B parameter model (128k context window)

- 12B parameter model (128k context window)

- 27B parameter model (128k context window)

The default model is 4B. The 1B model is text-only, and the models from 4B up are multimodal. Here we pull the 4B model.

ollama run gemma3:4b

dongsik@dongsikleeui-MacBookPro % ollama run gemma3:4b
pulling manifest
pulling 377655e65351... 100% ▕█████████████████████████████████████████████████▏ 3.3 GB
pulling e0a42594d802... 100% ▕█████████████████████████████████████████████████▏  358 B
pulling dd084c7d92a3... 100% ▕█████████████████████████████████████████████████▏ 8.4 KB
pulling 0a74a8735bf3... 100% ▕█████████████████████████████████████████████████▏   55 B
pulling ffae984acbea... 100% ▕█████████████████████████████████████████████████▏  489 B
verifying sha256 digest
writing manifest
success
>>> Send a message (/? for help)

Verify the installation

dongsik@dongsikleeui-MacBookPro % ollama --version
ollama version is 0.6.0
dongsik@dongsikleeui-MacBookPro % ollama list
NAME                        ID              SIZE      MODIFIED
gemma3:4b                   c0494fe00251    3.3 GB    41 minutes ago
dongsik@dongsikleeui-MacBookPro %

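First image analysis.

The picture shows three cool cats wearing stylish sunglasses.

Let's first see how the model describes the picture. We start by asking in English: "Describe what you see in this picture"

Three cool cats wearing glasses. Source: yganko on Vecteezy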

dongsik@dongsikleeui-MacBookPro % ollama run gemma3:4b
>>> Describe what you see in this picture vecteezy_three-cool-cats.jpg
Okay, I've analyzed the image vecteezy_three-cool-cats.jpg. Here's a description of what I see:

**Overall Impression:**

The image is a playful and vibrant digital illustration featuring three stylized cats.
It has a cartoonish, slightly retro feel, likely intended for a fun or whimsical design.

**Detailed Description:**

*   **Subject:** There are three cats, each with a distinct style.
    *   **Cat 1 (Left):** This cat is a bright, almost neon blue with a sleek, modern design.
        It has large, expressive eyes and a confident pose, looking directly at the viewer.   
    *   **Cat 2 (Center):** This cat is a vibrant pink with a more rounded, friendly appearance.
        It has a slightly mischievous expression and is playfully leaning forward.
    *   **Cat 3 (Right):** This cat is a bright yellow with a more geometric, almost robotic-looking 
        design. It has a slightly tilted head and a curious expression.

*   **Style:** The illustration is clearly digital and stylized. 
    The lines are clean and bold, and the colors are saturated. 
    It resembles a vector graphic, which means it can be scaled up or down without losing quality.

*   **Background:** The background is a solid, dark gray, which helps the bright colors of the cats 
    to pop.

*   **Composition:** The cats are arranged in a horizontal line, with the central cat slightly closer
    to the viewer.

*   **Overall Tone:** The image has a cheerful and energetic tone. It's designed to be appealing and fun.

**In essence, it's a cute and colorful digital illustration of three stylized cats, 
  likely intended for use in a graphic design project.**

Would you like me to focus on a specific aspect of the image, such as the style, color palette, 
or potential uses for this illustration?

>>> Send a message (/? for help)

Processing on the MacBook M1 is fairly fast. The first question took about 20 seconds to produce the description above.

The same question asked in Korean:

>>> 이 그림에서 보이는 것을 한글로 설명하세요 vecteezy_three-cool-cats.jpg
네, vecteezy_three-cool-cats.jpg 그림에 보이는 것을 설명해 드리겠습니다.

이 그림은 세 마리의 귀여운 고양이 그림입니다. 각 고양이는 다른 색깔과 스타일을 가지고 있습니다.

*   **왼쪽 고양이:** 밝은 파란색으로, 세련되고 현대적인 느낌을 줍니다. 눈이 크고 자신감 있는 표정을 짓고 있습니다.
*   **중간 고양이:** 밝은 분홍색으로, 둥글고 친근한 느낌을 줍니다. 장난기 넘치는 표정으로 앞으로 기대어 있습니다.
*   **오른쪽 고양이:** 밝은 노란색으로, 기하학적인 디자인을 하고 있습니다. 머리를 살짝 기울여 호기심 있는 표정을 짓고 있습니다.

전체적으로 디지털 그림이며, 선이 깔끔하고 색상이 강렬하게 표현되어 있습니다. 배경은 어두운 회색으로, 고양이들의 색깔을 더욱 돋보이게 합니다.

전반적으로 귀엽고 활기찬 느낌을 주는 그림입니다.

혹시 그림의 특정 부분에 대해 더 자세한 설명을 원하시면 말씀해주세요.

>>> Send a message (/? for help)

I asked about the color of the sunglasses on the cat on the right. To my eye they look yellow, but gemma3 says they are bright gold.

I suppose that color counts as gold?

>>> 이 그림에서 오른쪽 고양이의 썬글라스 색깔을 알려주세요 vecteezy_three-cool-cats.jpg
오른쪽 고양이의 썬글라스 색깔은 **금색 (Gold)** 입니다.

사진을 자세히 살펴보니, 썬글라스 렌즈와 프레임이 모두 밝은 금색으로 되어 있습니다.

>>> Send a message (/? for help)

>>> Please tell me the color of the sunglasses on the cat on the right in this picture. vecteezy_three-cool-cats.jpg
The sunglasses on the cat on the right are **gold**. They have a bright, metallic gold color.

>>>
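The prompts above were typed into the interactive ollama run session. As a rough sketch, the same kind of request could also be sent from Python through Ollama's REST API, which accepts base64-encoded images in the images field of /api/generate. The helper names build_image_payload and describe_image are made up for this illustration, and the code assumes an Ollama server is listening on the default port 11434 with gemma3:4b already pulled.

```python
import base64
import json
import urllib.request

# Default local Ollama endpoint (assumption: server runs on this machine).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_image_payload(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build a non-streaming /api/generate request body with one attached image."""
    return {
        "model": model,
        "prompt": prompt,
        # The REST API expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for a single JSON object instead of a token stream.
        "stream": False,
    }


def describe_image(path: str, prompt: str, model: str = "gemma3:4b") -> str:
    """Send the image at `path` to a running Ollama server and return the reply text."""
    with open(path, "rb") as f:
        payload = build_image_payload(model, prompt, f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, describe_image("vecteezy_three-cool-cats.jpg", "Describe what you see in this picture") would correspond to the first prompt in this post.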

Next, let's test with an image that contains a table.

>>> Describe the content of the given image ~/gemma3_prerequisites.jpeg
Added image '/Users/dongsik/workspace/ekyc/gemma3_prerequisites.jpeg'
The image is a table outlining the recommended GPU requirements for different model sizes (in billions of parameters) for a Text-to-Text model.
Here’s a breakdown of the information presented:

*   **Parameters (Billion):** The table lists the model size in billions of parameters (1B, 4B, 12B, 27B).
*   **VRAM Requirement (Text-to-Text):** This column specifies the amount of VRAM (Video RAM) needed to run the model for Text-to-Text tasks.
*   **VRAM Requirement (Image-to-Text):** This column specifies the amount of VRAM needed to run the model for Image-to-Text tasks.
*   **Recommended GPU:** This column lists the recommended GPU based on the VRAM requirement.

Essentially, it’s a chart showing the relationship between model size, VRAM requirements, and the corresponding GPU needed to run the model
effectively.

Do you want me to elaborate on any specific aspect of this table, such as the VRAM requirements for a particular model size or the GPU
recommended for a specific task?

>>>

>>> 주어진 이미지의 내용을 설명하세요. /Users/dongsik/workspace/ekyc/gemma3_prerequisites.jpeg
Added image '/Users/dongsik/workspace/ekyc/gemma3_prerequisites.jpeg'
다음은 이미지의 내용에 대한 설명입니다.

**표는 텍스트-투-텍스트 및 이미지-투-텍스트 작업에 필요한 GPU 사양을 나타냅니다.**

*   **Parameters (Billions):** 모델의 파라미터 수를 나타냅니다.
*   **VRAM Requirement (Text-to-Text):** 텍스트-투-텍스트 작업에 필요한 VRAM(비디오 RAM)의 양을 나타냅니다.
*   **VRAM Requirement (Image-to-Text):** 이미지-투-텍스트 작업에 필요한 VRAM의 양을 나타냅니다.
*   **Recommended GPU:** 주어진 파라미터 수에 맞는 VRAM 요구 사항을 충족하는 데 권장되는 GPU 모델을 나열합니다.

**주요 관찰 사항:**

*   파라미터 수가 증가함에 따라 텍스트-투-텍스트 및 이미지-투-텍스트 작업 모두에 필요한 VRAM 요구 사항이 크게 증가합니다.
*   더 큰 모델은 효율적으로 작동하기 위해 훨씬 더 많은 VRAM이 필요합니다.

이 표가 도움이 되었기를 바랍니다!

>>>

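Finally, let's ask a question whose answer has to be read from the image.

"How much VRAM does the 4B model need for image-to-text according to the picture?" The correct answer is 10.4 GB.

Let's look at the result.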

>>> 그림에서 4B 모델을 image to text 하기위해 필요한 VRAM은 은 얼마인가요?
그림에 따르면 4B 모델을 image to text 하기 위해 필요한 VRAM은 10.4 GB입니다.

>>>

Wow!


[eKYC] Colab LLM & OCR with granite3.2-vision

a_mnesia 2025. 2. 24. 21:15


eKYC with Ollama & LLM Multi-modal Model OCR & DeepFace on Colab

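Overview

In this post we build an eKYC demo on Google Colab using Ollama, a local LLM vision model, and the DeepFace library.

A public LLM such as ChatGPT could also be used, but handling sensitive customer data such as ID cards and photos with a public LLM runs into security restrictions, so the demo is implemented with a local LLM.

What is eKYC?

eKYC (Electronic Know Your Customer) is an automated process for verifying a customer's identity digitally. Whereas the traditional KYC (Know Your Customer) process requires physical documents (e.g. an ID card, proof of address) and may involve in-person verification, eKYC uses online channels and digital technology to verify identity remotely. It is used in many industries, including financial institutions, telecom carriers, and e-commerce platforms, and is essential for complying with anti-money-laundering (AML) and counter-terrorist-financing (CTF) regulations.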

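Key features

1. Digital identity verification:

Customers submit an ID photo, a selfie (for face recognition), and so on through a web portal or mobile app.
OCR (optical character recognition) extracts the document data, and biometrics (e.g. face recognition) confirm that the person is the document holder.
The information is validated against government databases and trusted third-party data sources (e.g. Aadhaar in India).

2. Differences from traditional KYC:

Speed: eKYC can complete identity verification within minutes, while traditional KYC can take days to weeks.
Convenience: completed online, with no physical visit or mailing of documents.
Efficiency: automation reduces human error and cost.
Security: strengthened with biometrics, one-time passwords (OTP), and blockchain technology.

3. Key technologies:

Biometrics: face, fingerprint, and voice recognition.
AI and machine learning: document forgery detection, risk profiling.
Blockchain: data security and personal data management.
Database integration: checks against public registries, sanctions lists, and PEP (politically exposed person) lists.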

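Benefits of eKYC

Better customer experience: fast, simple onboarding reduces customer drop-off.
Lower cost: less manual processing and physical document management.
Regulatory compliance: AML, CTF, and data protection laws (GDPR, eIDAS, etc.).
Fraud prevention: digital traces (IP address, typing speed, etc.) and biometrics reduce the risk of identity theft and fraud.
Scalability: suited to onboarding customers at large scale.

Examples of eKYC

India: Aadhaar-based eKYC completes bank account opening and SIM card registration within minutes. 99.9% of Indian adults hold a digital ID.
Banks: customers upload an ID and a selfie through a mobile app to open an account.
Telecom carriers: eKYC verifies identity when a SIM card is activated.
Healthcare: verifying patient identity and ensuring the accuracy of medical records.

Limitations and challenges

Data security: risk of leaking sensitive personal data (biometric data, ID information).
Possibility of error: biometric recognition can produce false positives and false negatives.
Regulatory differences: eKYC rules differ by country (e.g. the United States has no federal eID).
Access to technology: users with limited access to digital devices or the internet may be excluded.

Uses of eKYC

Financial institutions and fintech companies use eKYC to streamline account opening, loan applications, insurance enrollment, and so on. For example, a mobile banking app completes identity verification by scanning a resident registration card or driver's license and running face recognition. eKYC adoption is growing as internet penetration rises and biometric technology matures.

eKYC is a modern identity verification method that digitizes traditional KYC to improve speed, convenience, and security. It is essential in industries such as finance, telecom, and healthcare, and contributes in particular to regulatory compliance and fraud prevention. It still needs continued improvement with data security and access to technology in mind.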

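Lightweight multimodal LLMs

The demo uses the recently released Gemma 3 and Granite 3.2 Vision models, so here is a brief summary of their key features and uses.

Gemma 3

Overview: a lightweight open-source multimodal model developed by Google and based on Gemini technology. The vision variants (4B, 12B, 27B parameters) process text and images and support tasks such as image description, question answering, and visual understanding.
Key features:
- Multimodal: processes text and images, with a 128K-token context window and support for over 140 languages.
- Efficiency: runs on consumer hardware such as mobile devices and workstations.
- Performance: the 27B model competes with Gemini-1.5-Pro and is strong at summarization, reasoning, and code generation.
- Implementation: image-based tasks (e.g. image description, text extraction) can be run with Hugging Face's Gemma3ForConditionalGeneration and AutoProcessor.
Uses: multimodal applications such as chatbots, coding assistants, research tools, image captioning, and visual Q&A.
Availability: offered on Hugging Face, Ollama, and Google AI Studio, with open model weights that can be customized.

Granite 3.2 Vision

Overview: a 2B-parameter open-source vision-language model developed by IBM, designed for enterprise use under the Apache 2.0 license and specialized in visual document understanding.
Key features:
- Document understanding: optimized for extracting content from tables, charts, infographics, and text-heavy layouts, with strong OCR (optical character recognition).
- Efficiency: a small model offering fast performance at low cost, well suited to enterprise applications.
- Training: trained on diverse visual data (fonts, layouts, charts), so it can handle complex document formats.
Uses: document processing automation, data extraction from financial reports, invoices, and presentations, and building enterprise AI assistants.
Availability: offered on Hugging Face, IBM watsonx.ai, and elsewhere, and can be fine-tuned as open source.

Comparison

Scope: Gemma 3 Vision is a general-purpose multimodal model for text and image tasks, while Granite 3.2 Vision is an enterprise model specialized in document understanding.
Scale: Gemma 3 goes up to 27B and suits a wide range of tasks, while Granite 3.2 is small and efficient at 2B.
Uses: Gemma 3 fits general-purpose AI tool development, while Granite 3.2 fits businesses that need document analysis.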

DeepFace

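DeepFace is an open-source Python library for face recognition and analysis that supports a range of deep-learning-based face tasks. It is mainly used for identity verification, emotion analysis, and attribute extraction, and is applied in use cases such as eKYC (Electronic Know Your Customer).

Key features

Face recognition: checks whether two face images show the same person (1:1 matching) or identifies a face in a database (1:N matching).
Analysis:
- Emotion analysis (e.g. happy, sad, angry).
- Age, gender, and race estimation.
- Facial attributes (e.g. whether glasses are worn).
Anti-spoofing: detects faces faked with a photo or video (spoofing).
Backend support: supports several deep learning models, including VGG-Face, FaceNet, and ArcFace.
Efficiency: quick to implement thanks to pretrained models.
Compatibility: can be integrated with OpenCV, TensorFlow, and PyTorch.

Main functions

DeepFace.verify(): checks whether the faces in two images match.
DeepFace.analyze(): analyzes facial attributes such as emotion, age, and gender.
DeepFace.stream(): face recognition and analysis on a real-time webcam stream.
DeepFace.find(): matches a face against a database.

Uses

eKYC: identity verification for finance and telecom (e.g. account opening, SIM registration).
Security: access control, surveillance systems.
Marketing: customer emotion analysis, targeting.
Healthcare: patient identity verification.

Limitations

Accuracy: results can vary with lighting, angle, and image quality.
Privacy: processing biometric data requires compliance with data protection rules (GDPR, etc.).
Resources: some models can be slow without a GPU.

DeepFace is a powerful tool that makes face recognition and analysis easy to implement, and it is used in areas such as eKYC, security, and marketing. Being open source, it can be customized, and the pretrained models allow a quick start.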

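The Localtunnel tunneling service

Localtunnel is a tunneling service that exposes a locally running service to the outside. It is useful for reaching a local PC from an external network.

What Localtunnel is for

It makes a local server reachable from outside.
It lets an external network connect to your local PC.

How to use Localtunnel

If npm is installed, run npm install -g localtunnel.
The -p option sets the port and the -s option sets the subdomain.

ngrok is a tool similar to Localtunnel.

ngrok makes it simpler than router port forwarding to reach a local PC from the public internet.
ngrok lets you inspect HTTP request logs while the application is running.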

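Demo implementation

Change the Colab runtime type to the T4 GPU hardware accelerator.

Click the drop-down at the top right and open the "Change runtime type" menu.

In the "Change runtime type" dialog you can select the "Hardware accelerator", as shown in the figure below. Select T4 GPU instead of the default CPU.

Selecting T4 GPU disconnects and deletes the existing CPU runtime and starts a new session with the T4 GPU runtime.

Click "OK" to start the new T4 session runtime.

Now we go through the setup needed for the new runtime step by step. First, check the current working directory and confirm that the runtime's GPU is the T4.

# Check the current path and the GPU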
!pwd
!nvidia-smi

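Files in Colab's runtime storage are deleted when the runtime is disconnected. For repeated work, mount a separate Google Drive directory so that you can disconnect and reconnect the runtime without losing the files you are working on.

# Mount Google Drive in Colab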
from google.colab import drive
drive.mount('/content/drive')

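If a pop-up like the following asks whether to allow access to Google Drive, click "Connect to Google Drive", choose your Google account, and connect to your own Google Drive.

After three more steps of choosing the Google account and access options, "/content/drive" is mounted.

Change the base directory to the project directory you created beforehand in your Google Drive.

# Change to the working directory (chdir)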
import os
os.chdir("/content/drive/MyDrive/ekyc_with_llm/")

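Check the current directory and the source files it contains.

# Check the path after chdir and list the files
!pwd
!ls -al

In my case, I uploaded the sources and files I had been developing locally to Google Drive.

Install the required libraries.

# Install the required libraries and Ollama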
!curl -fsSL https://ollama.com/install.sh | sh
!pip install ollama
!pip install deepface
!pip install -q streamlit
!pip install -q streamlit_option_menu

When the installation finishes, check the pip list to confirm everything was installed.

!pip list | grep "ollama\|deepface\|streamlit"

Once the required libraries are installed, first start the Ollama server in the background.

# Start Ollama in the background using nohup
!nohup ollama serve > ollama.log &
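ollama serve needs a moment to start, so a pull issued immediately afterwards may fail if the server is not yet listening. One way to guard against this, sketched here with a hypothetical helper wait_for_port, is to poll port 11434 (Ollama's default) from a notebook cell before continuing. The timeout values are arbitrary choices for illustration.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 11434, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet: wait briefly and retry.
            time.sleep(0.5)
    return False
```

In the notebook this could be called as wait_for_port() right after the nohup cell, and the model pulls run only when it returns True.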

Pull the lightweight multimodal models gemma3 4b and granite3.2-vision 2b.

!ollama list
!ollama pull gemma3:4b
!ollama pull granite3.2-vision
!ollama list

Install localtunnel.

!npm install -g localtunnel

Run Streamlit in the background.

!streamlit run app.py &>logs.txt &

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.


  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://172.28.0.12:8501
  External URL: http://34.125.235.113:8501
  
  2025-04-22 03:20:36.026521: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1745292036.049572    4667 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1745292036.057146    4667 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-22 03:20:36.081008: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

Expose the Streamlit port externally through localtunnel.

!npx localtunnel --port 8501 & curl ipv4.icanhazip.com

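Click the "your url" link shown below, or copy it and open it in a browser.

When localtunnel was started, the curl command fetched the external public IP of the server. Enter that IP as the Tunnel Password.

Copy and paste the IP, then click "Click to Submit".

The first page is the ID Card registration page. You can register a local file or take a photo of the ID and upload it.

Select a file with Browse files.

The uploaded ID is a sample I made by pasting my own photo onto a generic Philippine driver's license found on the internet.

Detecting maliciously forged or altered IDs is one of the items that still has to be implemented.

If you know a way to do this, a comment would be a great help.

Clicking the "Click to OCR!" button runs OCR on the ID.

It took 10 seconds on the Google Colab T4. On my own M1 the first run took about 60 to 70 seconds (20 to 30 seconds from the second run on), so the T4 is clearly faster.

The information on the driver's license is extracted accurately.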

{
    "license": "Non-Professional Driver's License",
    "name": "DELA CRUZ, JUAN PEDRO GARCIA",
    "nationality": "PHL",
    "sex": "M",
    "dateOfBirth": "1987/10/04",
    "weight": "70",
    "height": "1.55",
    "address": "UNIT/HOUSE NO. BUILDING, STREET NAME, BARANGAY, CITY/MUNICIPALITY",
    "licenseNumber": "N03-12-123456",
    "expirationDate": "2022/10/04",
    "agencyCode": "N32",
    "bloodType": "O+",
    "eyesColor": "BLACK",
    "Restrictions": "NONE",
    "conditions": "1.2"
}
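The post does not show how app.py turns the model's reply into the JSON above. As a minimal sketch, assuming the vision model returns the JSON object somewhere inside free-form text, the object could be pulled out and checked for required fields as follows. The helper names and the REQUIRED_FIELDS list are hypothetical and not taken from the original app.

```python
import json

# Hypothetical list of fields the eKYC flow would insist on.
REQUIRED_FIELDS = ("name", "dateOfBirth", "licenseNumber", "expirationDate")


def extract_json_object(reply: str) -> dict:
    """Parse the outermost {...} span found in a model reply."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    return json.loads(reply[start : end + 1])


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]
```

A non-empty result from missing_fields would be a reason to ask the user for a clearer photo rather than continue with verification.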

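Second, the user takes a photo of themselves for self-verification.

Clicking "Take Photo" captures and uploads a photo of your face, and the photo on the previously uploaded ID is compared with the selfie to decide whether they show the same person.

In the verification result, if distance is smaller than threshold, the similarity is considered high and "verified": true is returned.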

{
    "verified": true,
    "distance": 0.47204673328605784,
    "threshold": 0.68,
    "model": "VGG-Face",
    "detector_backend": "retinaface",
    "similarity_metric": "cosine",
    "facial_areas": {
        "img1": {
            "x": 0,
            "y": 0,
            "w": 104,
            "h": 130,
            "left_eye": null,
            "right_eye": null
        },
        "img2": {
            "x": 0,
            "y": 0,
            "w": 132,
            "h": 178,
            "left_eye": null,
            "right_eye": null
        }
    },
    "time": 1.98
}
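The decision in the result above comes down to comparing a cosine distance with a threshold. The sketch below reproduces that rule on plain Python lists. The embeddings are toy vectors, since real face embeddings come from the model (VGG-Face here), and the use of <= for the comparison is an assumption about the library's behavior. The function names are made up for this illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def is_same_person(embedding_1, embedding_2, threshold=0.68):
    """Mimic the shape of the verification result for two face embeddings."""
    distance = cosine_distance(embedding_1, embedding_2)
    return {
        # Smaller distance means more similar faces.
        "verified": distance <= threshold,
        "distance": distance,
        "threshold": threshold,
    }
```

With the values from the result above, 0.472 is below the 0.68 threshold, which is why verified is true.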

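Features to add next

Detecting forged IDs
Face Anti-Spoofing
Detecting the ID position in the UI and drawing a rectangle (as a separate OpenCV feature)
Detecting the face in the selfie and drawing an ellipse (as a separate OpenCV feature)

References

Face Anti-Spoofing: face recognition is a way to identify a person or authenticate a specific person's identity by their face

st.camera_input: displays a widget that returns a picture from the user's webcam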


Running Gemma models with Ollama on Google Colab

a_mnesia 2024. 9. 22. 08:31


Gemma - Run with Ollama (with Colab)

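Running Gemma with Ollama on Google Colab

This notebook shows how to run inference on Gemma models using Ollama. Ollama is an easy-to-use solution for running LLMs locally, and it supports Gemma out of the box.

Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. It has been updated to version 1.1.

Gemma is available in 2b and 7b parameter sizes.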

ollama run gemma:2b
ollama run gemma:7b (default)

Check the gemma model information

Colab link

Installing Ollama

Install Ollama using the official install script.

!curl -fsSL https://ollama.com/install.sh | sh

!curl -fsSL https://ollama.com/install.sh | sh
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
############################################################################################# 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
WARNING: Unable to detect NVIDIA/AMD GPU. Install lspci or lshw to automatically detect and install GPU dependencies.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.

Starting (running) Ollama

Start Ollama in the background using nohup. The API service port is 11434.

!nohup ollama serve > ollama.log &

nohup: redirecting stderr to stdout

List the Ollama models

!ollama list

NAME    ID    SIZE    MODIFIED

Inference

Pull the gemma 7B model from the command line and run inference.

!ollama run gemma:7b "대한민국의 수도는 어디야?" 2> ollama.log

대한민국의 수도는 서울입니다.</end_of_turn>

Now ask in English.

!ollama run gemma:7b "What is the capital of Korea?" 2> ollama.log

The capital of Korea is **Seoul**. It is the largest and most populous city in the country.

Check the model that was run

!ollama list

NAME        ID              SIZE      MODIFIED       
gemma:7b    a72c7f4d0a15    5.0 GB    41 seconds ago

Generate a response via the REST endpoint

!curl http://localhost:11434/api/generate -d '{ \
  "model": "gemma:7b", \
  "prompt":"프랑스의 수도는 어디야?" \
}'

{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.191815259Z","response":"프","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.238195689Z","response":"랑","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.284446079Z","response":"스","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.330415772Z","response":"의","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.376395312Z","response":" 수","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.404399661Z","response":"도","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.431299982Z","response":"는","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.45782401Z","response":" 파","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.487122476Z","response":"리","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.514609513Z","response":"(","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.540247914Z","response":"Paris","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.565826889Z","response":")","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.592181701Z","response":"입니다","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.619404725Z","response":".","done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:43:59.646506512Z","response":"","done":true,"done_reason":"stop","context":[968,2997,235298,559,235298,15508,235313,1645,108,237964,239566,236354,236137,22618,236840,236214,53355,238173,238305,181537,615,235298,559,235298,15508,235313,108,235322,2997,235298,559,235298,15508,235313,2516,108,237964,239566,236354,236137,22618,236840,236214,59994,236432,235278,29437,235275,47555,235265],"total_duration":650239442,"load_duration":39196479,"prompt_eval_count":37,"prompt_eval_duration":114387000,"eval_count":15,"eval_duration":454674000}
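Each line of the output above is a separate JSON object carrying one piece of the answer in its response field, and the last object has done set to true. A small sketch of how such a stream could be joined back into the full text in Python is shown below. The function name join_stream is made up for this illustration.

```python
import json


def join_stream(lines):
    """Concatenate the `response` pieces of a streamed /api/generate reply."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines between JSON objects.
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            # The final object carries statistics and ends the stream.
            break
    return "".join(parts)
```

Feeding it the lines printed by curl would yield the complete sentence instead of one fragment per line.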

Chat with Gemma via the REST endpoint

!curl http://localhost:11434/api/chat -d '{ \
  "model": "gemma:7b", \
  "messages": [ \
    { "role": "user", "content": "스페인의 수도는 어디야?" } \
  ] \
}'

{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.627626051Z","message":{"role":"assistant","content":"스"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.673755341Z","message":{"role":"assistant","content":"페"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.719617417Z","message":{"role":"assistant","content":"인"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.765324012Z","message":{"role":"assistant","content":"의"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.810472803Z","message":{"role":"assistant","content":" 수"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.842618919Z","message":{"role":"assistant","content":"도"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.865453528Z","message":{"role":"assistant","content":"는"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.892675721Z","message":{"role":"assistant","content":" 마"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.922734686Z","message":{"role":"assistant","content":"드"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.94894934Z","message":{"role":"assistant","content":"리"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:05.974847995Z","message":{"role":"assistant","content":"드"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:06.000792996Z","message":{"role":"assistant","content":"입니다"},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:06.027347034Z","message":{"role":"assistant","content":"."},"done":false}
{"model":"gemma:7b","created_at":"2024-09-20T07:44:06.054820407Z","message":{"role":"assistant","content":""},"done_reason":"stop","done":true,"total_duration":612398322,"load_duration":38850549,"prompt_eval_count":37,"prompt_eval_duration":104264000,"eval_count":14,"eval_duration":427080000}
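When the token-by-token stream is not needed, the chat endpoint can return one JSON object if "stream": false is included in the request body. The following is a minimal sketch of the same chat call from Python using only the standard library. build_chat_request and chat are hypothetical helper names, and the URL assumes the default local port 11434.

```python
import json
import urllib.request


def build_chat_request(model, messages, url="http://localhost:11434/api/chat"):
    """Create a POST request for a single, non-streamed chat completion."""
    body = {"model": model, "messages": messages, "stream": False}
    return urllib.request.Request(
        url,
        # ensure_ascii=False keeps Korean prompts readable in the payload.
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model, messages):
    """Send the request to a running Ollama server and return the assistant's text."""
    with urllib.request.urlopen(build_chat_request(model, messages)) as response:
        return json.loads(response.read())["message"]["content"]
```

For example, chat("gemma:7b", [{"role": "user", "content": "스페인의 수도는 어디야?"}]) would send the same question as the curl call above.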


Mac Intel CPU Ollama gemma2

a_mnesia 2024. 9. 6. 00:35


I ran Gemma2 with Ollama on the MacBook Pro 2015 I used before switching to the M1.

Processor: Multi-core CPU (Intel i5/i7/i9 or AMD equivalent)
Memory: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
Storage: At least 20 GB of free space on an SSD
Operating System: Recent versions of Windows, macOS, or Linux

I run it in a Conda virtual environment created with Python 3.11.4.

(base) Gabrielui-MacBookPro:~ gabriel$ conda env list
# conda environments:
#
base                  *  /usr/local/anaconda3
gabriel                  /usr/local/anaconda3/envs/gabriel

(base) Gabrielui-MacBookPro:~ gabriel$ conda activate gabriel
(gabriel) Gabrielui-MacBookPro:~ gabriel$

(gabriel) Gabrielui-MacBookPro:~ gabriel$ python -V
Python 3.11.4

Gemma2 comes in three sizes: 2B, 9B, and 27B parameters.

2B Parameters ollama run gemma2:2b
9B Parameters ollama run gemma2 (Default)
27B Parameters ollama run gemma2:27b

Of the three, the 9B model is the default.

(gabriel) Gabrielui-MacBookPro:~ gabriel$ ollama --version
ollama version is 0.3.9

(gabriel) Gabrielui-MacBookPro:~ gabriel$ ollama run gemma2
pulling manifest
pulling ff1d1fc78170... 100% ▕█████████████████████████████████████████████████████████████████▏ 5.4 GB
pulling 109037bec39c... 100% ▕█████████████████████████████████████████████████████████████████▏  136 B
pulling 097a36493f71... 100% ▕█████████████████████████████████████████████████████████████████▏ 8.4 KB
pulling 2490e7468436... 100% ▕█████████████████████████████████████████████████████████████████▏   65 B
pulling 10aa81da732e... 100% ▕█████████████████████████████████████████████████████████████████▏  487 B
verifying sha256 digest
writing manifest
success
>>>

Check the downloaded model.

(gabriel) Gabrielui-MacBookPro:~ gabriel$ ollama list
NAME         	ID          	SIZE  	MODIFIED
gemma2:latest	ff02c3702f32	5.4 GB	14 minutes ago
(gabriel) Gabrielui-MacBookPro:~ gabriel$

Ollama + Gemma2 (9B)

Check the information for the model downloaded with ollama run gemma2.

>>> /show info
  Model
  	arch            	gemma2
  	parameters      	9.2B
  	quantization    	Q4_0
  	context length  	8192
  	embedding length	3584

  Parameters
  	stop	"<start_of_turn>"
  	stop	"<end_of_turn>"

  License
  	Gemma Terms of Use
  	Last modified: February 21, 2024

>>>

Ask a question at the prompt: "why is the sky blue?"

>>> why is the sky blue?
The sky appears blue due to a phenomenon called **Rayleigh scattering**.

Here's a breakdown:

* **Sunlight:** Sunlight is made up of all the colors of the rainbow.
* **Atmosphere:** When sunlight enters Earth's atmosphere, it collides with tiny air molecules (mainly nitrogen and
oxygen).
* **Scattering:** These molecules scatter the sunlight in all directions.
* **Wavelengths:**  Blue light has a shorter wavelength than other colors in the visible spectrum. Shorter wavelengths are
scattered more effectively by the air molecules.

Therefore, blue light gets scattered much more than other colors, making the sky appear blue to our eyes.

**At sunrise and sunset:** The sunlight travels through more atmosphere to reach us. This means even more blue light is
scattered away, allowing longer wavelengths like orange and red to dominate, resulting in those beautiful hues.

>>> Send a message (/? for help)
Use Ctrl + d or /bye to exit.

Generate a response via the REST endpoint

(gabriel) Gabrielui-MacBookPro:~ gabriel$ curl http://localhost:11434/api/generate -d '{
>   "model": "gemma2",
>   "prompt": "Why is the sky blue?"
> }'
{"model":"gemma2","created_at":"2024-09-05T15:30:49.100401Z","response":"The","done":false}
{"model":"gemma2","created_at":"2024-09-05T15:30:49.464952Z","response":" sky","done":false}
{"model":"gemma2","created_at":"2024-09-05T15:30:49.798186Z","response":" appears","done":false}
{"model":"gemma2","created_at":"2024-09-05T15:30:50.143292Z","response":" blue","done":false}



{"model":"gemma2","created_at":"2024-09-05T15:32:05.437344Z","response":" have","done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:05.785728Z","response":" any","done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:06.14108Z","response":" other","done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:06.517148Z","response":" questions","done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:06.926336Z","response":"!","done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:07.269795Z","response":"","done":true,"done_reason":"stop","context":[106,1645,108,4385,603,573,8203,3868,235336,107,108,106,2516,108,651,8203,8149,3868,3402,577,476,27171,3151,5231,15599,44957,38497,168428,235248,109,4858,235303,235256,476,25497,235292,109,235287,5231,219715,66058,175521,603,4247,1644,908,576,832,573,9276,576,573,30088,235265,108,235287,5231,30962,235303,235256,117961,66058,3194,33365,30866,573,10379,235303,235256,13795,235269,665,1041,23524,675,16791,2681,24582,591,80711,23584,578,16175,846,108,235287,5231,102164,574,66058,3766,24582,17109,573,33365,575,832,16759,235265,235248,108,235287,5231,10716,181430,235256,66058,7640,2611,919,476,25270,35571,1178,1156,9276,575,573,12918,18303,235265,10323,576,736,235269,665,6803,30390,1683,978,731,573,2681,24582,1178,5543,95178,1154,3118,689,10436,235265,109,688,2339,235269,1212,783,1443,66058,109,651,30390,3868,2611,26676,1167,4628,774,832,16759,235269,3547,573,8203,4824,3868,235265,2625,48825,578,22097,235269,573,33365,37845,1593,978,576,573,13795,235265,1417,3454,573,3868,2611,603,30390,3024,1693,978,235269,15267,573,5543,95178,591,165117,578,72638,235275,577,6378,1167,4628,235269,10241,1941,4964,9276,235265,110,5331,682,1230,1013,692,791,1089,1156,3920,235341],"total_duration":81320587916,"load_duration":42868270,"prompt_eval_count":15,"prompt_eval_duration":3107284000,"eval_count":210,"eval_duration":78169448000}
(gabriel) Gabrielui-MacBookPro:~ gabriel$

Chat with Gemma via the REST endpoint

(gabriel) Gabrielui-MacBookPro:~ gabriel$ curl http://localhost:11434/api/chat -d '{
>   "model": "gemma2",
>   "messages": [
>     {
>       "role": "user",
>       "content": "why is the sky blue?"
>     }
>   ]
> }'
{"model":"gemma2","created_at":"2024-09-05T15:32:34.118301Z","message":{"role":"assistant","content":"The"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:34.466319Z","message":{"role":"assistant","content":" sky"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:34.821353Z","message":{"role":"assistant","content":" appears"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:35.182394Z","message":{"role":"assistant","content":" blue"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:32:35.523289Z","message":{"role":"assistant","content":" due"},"done":false}

{"model":"gemma2","created_at":"2024-09-05T15:33:50.644094Z","message":{"role":"assistant","content":" if"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:33:51.039595Z","message":{"role":"assistant","content":" you"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:33:51.411561Z","message":{"role":"assistant","content":" have"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:33:51.815174Z","message":{"role":"assistant","content":" any"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:33:52.152768Z","message":{"role":"assistant","content":" other"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:33:52.492949Z","message":{"role":"assistant","content":" questions"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:33:52.915517Z","message":{"role":"assistant","content":"!"},"done":false}
{"model":"gemma2","created_at":"2024-09-05T15:33:53.27064Z","message":{"role":"assistant","content":""},"done_reason":"stop","done":true,"total_duration":80946825634,"load_duration":44007406,"prompt_eval_count":15,"prompt_eval_duration":1749943000,"eval_count":190,"eval_duration":79152305000}
(gabriel) Gabrielui-MacBookPro:~ gabriel$
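
The chat endpoint streams one JSON object per line, each holding a small piece of the answer. As a sketch of how a client could stitch these pieces together, the helper below (join_stream is a hypothetical name) concatenates message.content until a chunk with done set to true arrives. It works on already-received lines, so no server is needed to try it.

```python
import json


def join_stream(lines):
    """Concatenate the assistant text from newline-delimited JSON chat chunks."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk carries the timing statistics
    return "".join(parts)


# First five chunks from the log above, shortened to the fields used here.
sample = [
    '{"message":{"role":"assistant","content":"The"},"done":false}',
    '{"message":{"role":"assistant","content":" sky"},"done":false}',
    '{"message":{"role":"assistant","content":" appears"},"done":false}',
    '{"message":{"role":"assistant","content":" blue"},"done":false}',
    '{"message":{"role":"assistant","content":" due"},"done":false}',
]
print(join_stream(sample))
```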

Ollama + Gemma 1.1 (2B)

(gabriel) Gabrielui-MacBookPro:~ gabriel$ ollama run gemma:2b
pulling manifest
pulling c1864a5eb193... 100% ▕██████████████████████████████████████████████████████████████████████████████████▏ 1.7 GB
pulling 097a36493f71... 100% ▕██████████████████████████████████████████████████████████████████████████████████▏ 8.4 KB
pulling 109037bec39c... 100% ▕██████████████████████████████████████████████████████████████████████████████████▏  136 B
pulling 22a838ceb7fb... 100% ▕██████████████████████████████████████████████████████████████████████████████████▏   84 B
pulling 887433b89a90... 100% ▕██████████████████████████████████████████████████████████████████████████████████▏  483 B
verifying sha256 digest
writing manifest
success
>>> Send a message (/? for help)

Check the downloaded models.

(gabriel) Gabrielui-MacBookPro:~ gabriel$ ollama list
NAME         	ID          	SIZE  	MODIFIED
gemma:2b     	b50d6c999e59	1.7 GB	3 minutes ago
gemma2:latest	ff02c3702f32	5.4 GB	40 minutes ago
(gabriel) Gabrielui-MacBookPro:~ gabriel$

Ask the same question that was put to Gemma2: "why is the sky blue?"

>>> why is the sky blue?
The sky appears blue due to Rayleigh scattering.

**Rayleigh scattering** is a scattering phenomenon where light waves are scattered in different directions depending on their wavelength.
Blue light has a shorter wavelength than other colors of light, meaning it scatters more effectively. This is why the sky appears blue.

Here's a more detailed explanation:

* **Sunlight** contains all colors of the rainbow.
* **Blue light** has a shorter wavelength than other colors.
* **When blue light** interacts with molecules in the atmosphere, it **scatters more strongly** than other colors.
* This scattering causes blue light to be scattered **towards the observer**.
* The scattered blue light is what we see in the sky as **blue color**.

**Other factors that contribute to the blue color of the sky:**

* **Dust particles:** Air molecules can also scatter light, but their scattering effect is generally weaker than that of molecules in the
atmosphere.
* **Water droplets:** Water droplets in the atmosphere can also scatter light, but their effect is typically less significant than that of
molecules.
* **Sun's position:** The sky appears blue because the sun is lower in the sky, meaning it is directly overhead. This means that more blue
light reaches our eyes.

In summary, the blue color of the sky is due to Rayleigh scattering of light by molecules in the atmosphere, which primarily results from
the shorter wavelength of blue light.

>>> Send a message (/? for help)

Check the MacBook's performance while the model is running.

Check the information for the running model

>>> /show info
  Model
  	arch            	gemma
  	parameters      	3B
  	quantization    	Q4_0
  	context length  	8192
  	embedding length	2048

  Parameters
  	repeat_penalty	1
  	stop          	"<start_of_turn>"
  	stop          	"<end_of_turn>"

  License
  	Gemma Terms of Use
  	Last modified: February 21, 2024

>>> Send a message (/? for help)

Ollama + Gemma 1.1 (7B)

This time, download and test the Gemma 7B model.

(gabriel) Gabrielui-MacBookPro:~ gabriel$ ollama run gemma:7b
pulling manifest
pulling ef311de6af9d... 100% ▕███████████████████████████████████████████████████████████████▏ 5.0 GB
pulling 097a36493f71... 100% ▕███████████████████████████████████████████████████████████████▏ 8.4 KB
pulling 109037bec39c... 100% ▕███████████████████████████████████████████████████████████████▏  136 B
pulling 65bb16cf5983... 100% ▕███████████████████████████████████████████████████████████████▏  109 B
pulling 0c2a5137eb3c... 100% ▕███████████████████████████████████████████████████████████████▏  483 B
verifying sha256 digest
writing manifest
success
>>> Send a message (/? for help)

Check the downloaded models

(gabriel) Gabrielui-MacBookPro:~ gabriel$ ollama list
NAME             ID              SIZE      MODIFIED
gemma:7b         a72c7f4d0a15    5.0 GB    2 weeks ago
gemma:2b         b50d6c999e59    1.7 GB    2 weeks ago
gemma2:latest    ff02c3702f32    5.4 GB    2 weeks ago

Ask the same question that was put to Gemma2: "why is the sky blue?"

>>> why is the sky blue?
**The sky is blue due to a phenomenon called Rayleigh scattering.**

* Sunlight is composed of all the colors of the rainbow, each with a different wavelength.
* Shorter wavelengths of light, such as blue light, have higher energy and more tightly spaced photons.
* When sunlight interacts with molecules in the atmosphere, such as nitrogen and oxygen, the shorter wavelengths are
scattered more effectively.

**Here's how it works:**

1. Sunlight enters the atmosphere and interacts with molecules of nitrogen and oxygen.
2. These molecules have electrons that can be excited by the energy of the incoming light.
3. The electrons absorb the energy and vibrate, then emit it in all directions.
4. However, **blue light is scattered more efficiently than other colors** because its shorter wavelength allows for more
frequent collisions between the photons and the molecules.

**This scattering process results in:**

* More blue light is scattered in all directions, reaching our eyes and making the sky appear blue.
* Longer wavelengths of light, such as red light, are scattered less efficiently, so we see more of them directly from
the sun.

**Additional factors influencing the sky's color:**

* **Time of day:** The sky is generally bluer at midday when the sunlight travels through less atmosphere.
* **Cloud cover:** Clouds can block the sunlight and scatter less blue light, making the sky appear less blue.
* **Pollution:** Pollution in the atmosphere can also scatter light and affect the sky's color.

>>> Send a message (/? for help)

Check the information for the running model

>>> /show info
  Model
  	arch            	gemma
  	parameters      	9B
  	quantization    	Q4_0
  	context length  	8192
  	embedding length	3072

  Parameters
  	penalize_newline	false
  	repeat_penalty  	1
  	stop            	"<start_of_turn>"
  	stop            	"<end_of_turn>"

  License
  	Gemma Terms of Use
  	Last modified: February 21, 2024

>>> Send a message (/? for help)

Checking query performance

On a MacBook with an Intel CPU, I ran the Gemma2 9B model as well as the Gemma 2B and 7B models with Ollama.

Thank you.


Testing the quantized heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF model with Ollama on an M1

a_mnesia 2024. 4. 20. 09:50


I watched the YouTube video that Teddy posted and followed along on a MacBook M1, changing a few settings along the way to fit my environment.

https://www.youtube.com/watch?v=VkcaigvTrug&t=23s

# GPU monitoring
% sudo asitop

# Run LangServe
% python server.py

# Expose the service externally with ngrok
% ngrok http --domain=humble-curiously-antelope.ngrok-free.app 8000

# RAG over a PDF
% streamlit run main.py

Create a conda virtual environment and install the required modules from requirements.txt

conda create -n llm311 python=3.11

% conda env list
# conda environments:
#
base                  *  /Users/dongsik/miniconda
llm                      /Users/dongsik/miniconda/envs/llm
llm311                   /Users/dongsik/miniconda/envs/llm311

% conda activate llm311

% python -V
Python 3.11.9

% pip list
Package    Version
---------- -------
pip        23.3.1
setuptools 68.2.2
wheel      0.41.2

I forked the example GitHub repository into my own account, cloned it to my PC, and modified it to fit my environment as I went.

Teddy's GitHub: https://github.com/teddylee777/langserve_ollama

My GitHub: https://github.com/dongshik/langserve_ollama

% ll
total 1000
drwxr-xr-x@ 12 dongsik  staff     384 Apr 20 09:22 .
drwxr-xr-x   4 dongsik  staff     128 Apr 19 16:35 ..
drwxr-xr-x@ 14 dongsik  staff     448 Apr 19 16:40 .git
-rw-r--r--@  1 dongsik  staff      50 Apr 19 16:35 .gitignore
-rw-r--r--@  1 dongsik  staff    3343 Apr 19 16:35 README.md
drwxr-xr-x@  8 dongsik  staff     256 Apr 19 16:35 app
drwxr-xr-x@  8 dongsik  staff     256 Apr 19 16:35 example
drwxr-xr-x@  3 dongsik  staff      96 Apr 19 16:35 images
drwxr-xr-x@  4 dongsik  staff     128 Apr 19 16:35 ollama-modelfile
-rw-r--r--@  1 dongsik  staff  481043 Apr 19 16:35 poetry.lock
-rw-r--r--@  1 dongsik  staff     659 Apr 19 16:35 pyproject.toml
-rw-r--r--@  1 dongsik  staff   14983 Apr 19 16:35 requirements.txt

pip install -r requirements.txt

% pip install -r requirements.txt
Ignoring colorama: markers 'python_version >= "3.11.dev0" and python_version < "3.12.dev0" and platform_system == "Windows"' don't match your environment

% pip list | grep lang
langchain                  0.1.16
langchain-community        0.0.32
langchain-core             0.1.42
langchain-openai           0.1.3
langchain-text-splitters   0.0.1
langchainhub               0.1.15
langdetect                 1.0.9
langserve                  0.0.51
langsmith                  0.1.47

% pip list | grep huggingface
huggingface-hub            0.22.2

Download the model from Hugging Face, register the EEVE Q5 model with Ollama, and run it

huggingface-cli download \
  heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF \
  ggml-model-Q5_K_M.gguf \
  --local-dir /Users/dongsik/GitHub/teddylee777/langserve_ollama/ollama-modelfile/EEVE-Korean-Instruct-10.8B-v1.0 \
  --local-dir-use-symlinks False
  
Consider using `hf_transfer` for faster downloads. This solution comes with some limitations. See https://huggingface.co/docs/huggingface_hub/hf_transfer for more details.
downloading https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF/resolve/main/ggml-model-Q5_K_M.gguf to /Users/dongsik/.cache/huggingface/hub/tmpkuuur4ki
ggml-model-Q5_K_M.gguf:  37%|███████████████████████████▌                                               | 2.81G/7.65G [04:35<09:55, 8.13MB/s]

% ll -sh
total 14954512
       0 drwxr-xr-x@ 5 dongsik  staff   160B Apr 20 10:02 .
       0 drwxr-xr-x@ 4 dongsik  staff   128B Apr 20 10:02 ..
       8 -rw-r--r--@ 1 dongsik  staff   369B Apr 19 16:35 Modelfile
       8 -rw-r--r--@ 1 dongsik  staff   419B Apr 19 16:35 Modelfile-V02
14954496 -rw-r--r--  1 dongsik  staff   7.1G Apr 20 10:02 ggml-model-Q5_K_M.gguf

<path>/langserve_ollama/ollama-modelfile/EEVE-Korean-Instruct-10.8B-v1.0/Modelfile

FROM ggml-model-Q5_K_M.gguf

TEMPLATE """{{- if .System }}
<s>{{ .System }}</s>
{{- end }}
<s>Human:
{{ .Prompt }}</s>
<s>Assistant:
"""

SYSTEM """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."""

PARAMETER TEMPERATURE 0
PARAMETER stop <s>
PARAMETER stop </s>

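The Modelfile settings are needed because, without them, the model may produce odd output once its answer is finished.

If a system prompt exists, the template inserts it at the .System position; here the text given by the SYSTEM instruction fills that slot.

Next, the <s> special token is prepended and the user's (Human's) question, .Prompt, is inserted.

The model (Assistant) then takes over and generates the answer.

※ Note!!

In the Modelfile, <s> is a special token that marks the start of a sentence. It is passed to the model as part of the tokenized input so that the model can recognize where a sentence begins and process it accordingly.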
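
To make the template easier to picture, here is a rough Python imitation of what the TEMPLATE above produces. It is only an illustration: render_prompt is a hypothetical helper, and the exact whitespace handling of Ollama's template engine is not reproduced.

```python
def render_prompt(prompt: str, system: str = "") -> str:
    """Mimic the Modelfile TEMPLATE above with plain string formatting."""
    text = ""
    if system:
        # Corresponds to the {{ if .System }} ... {{ end }} section.
        text += f"<s>{system}</s>\n"
    # The user's question goes in the Human turn; the model continues after "Assistant:".
    text += f"<s>Human:\n{prompt}</s>\n<s>Assistant:\n"
    return text


print(render_prompt("대한민국의 수도는 어디야?", system="A chat between a curious user and an artificial intelligence assistant."))
```

The stop parameters <s> and </s> then end generation as soon as the model starts a new turn.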

tokenizer.chat_template

{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = 'You are a helpful assistant.' %}{% endif %}{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in loop_messages %}{% if loop.index0 == 0 %}{{'<|im_start|>system ' + system_message + '<|im_end|> '}}{% endif %}{{'<|im_start|>' + message['role'] + ' ' + message['content'] + '<|im_end|>' + ' '}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant ' }}{% endif %}

https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0

Prompt Template

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {prompt}
Assistant:

Check the Ollama model list

% ollama list
NAME    	ID          	SIZE  	MODIFIED
eeve:q4 	68f4c2c2d9fe	6.5 GB	7 days ago
gemma:2b	b50d6c999e59	1.7 GB	10 days ago

Verify that Ollama is up and running.

% ps -ef | grep ollama
  501  3715  3691   0 Wed01PM ??         0:29.51 /Applications/Ollama.app/Contents/Resources/ollama serve
  501  4430     1   0 Wed01PM ??         0:00.03 /Applications/Ollama.app/Contents/Frameworks/Squirrel.framework/Resources/ShipIt com.electron.ollama.ShipIt /Users/dongsik/Library/Caches/com.electron.ollama.ShipIt/ShipItState.plist
  501 61608  3197   0 10:45AM ttys002    0:00.01 grep ollama

Register the newly downloaded model with Ollama.

ollama create eeve:q5 -f ollama-modelfile/EEVE-Korean-Instruct-10.8B-v1.0/Modelfile

When I tried to register the model with the Modelfile above, I got "Error: unknown parameter 'TEMPERATURE'".
Changing it to lowercase temperature let the model be created.
If you hit the same error, try changing it to lowercase temperature as well.
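
If a Modelfile has several uppercase parameter names, the fix can be scripted. The sketch below assumes, based only on the error above, that Ollama wants the name after PARAMETER in lowercase; lowercase_parameter_names is a hypothetical helper, not an Ollama tool.

```python
def lowercase_parameter_names(modelfile_text: str) -> str:
    """Lowercase the parameter name on every PARAMETER line of a Modelfile."""
    fixed = []
    for line in modelfile_text.splitlines():
        parts = line.split(None, 2)  # instruction, parameter name, value
        if len(parts) == 3 and parts[0].upper() == "PARAMETER":
            line = f"{parts[0]} {parts[1].lower()} {parts[2]}"
        fixed.append(line)
    return "\n".join(fixed)


original = "PARAMETER TEMPERATURE 0\nPARAMETER stop <s>\nPARAMETER stop </s>"
print(lowercase_parameter_names(original))
```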

% ollama create eeve:q5 -f ollama-modelfile/EEVE-Korean-Instruct-10.8B-v1.0/Modelfile

transferring model data
creating model layer
creating template layer
creating system layer
creating parameters layer
creating config layer
using already created layer sha256:b9e3d1ad5e8aa6db09610d4051820f06a5257b7d7f0b06c00630e376abcfa4c1
writing layer sha256:6b70a2ad0d545ca50d11b293ba6f6355eff16363425c8b163289014cf19311fc
writing layer sha256:1fa69e2371b762d1882b0bd98d284f312a36c27add732016e12e52586f98a9f5
writing layer sha256:3ab8c1bbd3cd85e1b39b09f5ff9a76e64da20ef81c22ec0937cc2e7076f1a81c
writing layer sha256:d86595b443c06710a3e5ba27700c6a93ded80100ff1aa808a7f3444ff529fa70
writing manifest
success

% ollama list
NAME    	ID          	SIZE  	MODIFIED
eeve:q4 	68f4c2c2d9fe	6.5 GB	7 days ago
eeve:q5 	0732d4a47219	7.7 GB	7 minutes ago
gemma:2b	b50d6c999e59	1.7 GB	10 days ago

ollama run eeve:q5

% ollama run eeve:q5
>>> 대한민국의 수도는 어디야?
안녕하세요! 대한민국의 수도에 대해 궁금해하시는군요. 서울이 바로 그 곳입니다! 서울은 나라의 북부에 위치해 있으며 정치, 경제, 문화의 중심지 역할을 하고 있습니다. 2019년 기준으로 약 970만 명의 인구를 가진 대도시로,
세계에서 가장 큰 도시 중 하나입니다. 또한 세계적인 금융 허브이자 주요 관광지로, 경복궁, 남산타워, 명동과 같은 다양한 역사적 및 현대적 명소를 자랑하고 있습니다. 서울은 활기찬 밤문화로도 유명하며, 많은 바와 클럽
관광객과 현지인 모두를 끌어들입니다. 대한민국의 수도에 대해 더 알고 싶으신 것이 있으신가요?

>>>

Let's ask with the prompt below.

한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주

>>> 한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주
대한민국의 수도에 대한 질문에 답변해 주셔서 감사합니다! 정답은 (D) 서울입니다. 서울은 나라의 북부에 위치해 있으며 정치, 경제, 문화의 중심지 역할을 하고 있습니다. 2019년 기준으로 약 970만 명의 인구를
대도시로, 세계에서 가장 큰 도시 중 하나입니다. 또한 세계적인 금융 허브이자 주요 관광지로, 경복궁, 남산타워, 명동과 같은 다양한 역사적 및 현대적 명소를 자랑하고 있습니다. 서울은 활기찬 밤문화로도 유명하며,
많은 바와 클럽이 관광객과 현지인 모두를 끌어들입니다. 대한민국의 수도에 대해 더 알고 싶으신 것이 있으신가요?

The answer appears almost as soon as the question is entered. The speed is good, and so is the quality of the answers.

>>> 다음 지문을 읽고 문제에 답하시오.
...
... ---
...
... 1950년 7월, 한국 전쟁 초기에 이승만 대통령은 맥아더 장군에게 유격대원들을 북한군의 후방에 침투시키는 방안을 제안했다. 이후, 육군본부는 육본직할 유격대와 육본 독립 유격대를 편성했다. 국군은 포항과 인접한 장사동 지역에 상륙작
... 전을 수행할 부대로 독립 제1유격대대를 선정했다. 육군본부는 독립 제1유격대대에 동해안의 장사동 해안에 상륙작전을 감행하여 북한군 제2군단의 보급로를 차단하고 국군 제1군단의 작전을 유리하게 하기 위한 작전명령(육본 작명 제174호)
... 을 하달했다. 9월 14일, 독립 제1유격대대는 부산에서 LST 문산호에 승선하여 영덕군의 장사동으로 출항했다.
...
... 1950년 9월 15일, 독립 제1유격대대는 장사동 해안에 상륙을 시도하였으나 태풍 케지아로 인한 높은 파도와 안개로 인해 어려움을 겪었다. LST 문산호는 북한군의 사격과 파도로 인해 좌초되었고, 상륙부대는 09:00시경에 전원이
... 상륙을 완료하였다. 그 후, 15:00시경에 200고지를 점령하였고, 다양한 무기와 장비를 노획하였다. 9월 16일과 17일에는 독립 제1유격대대가 여러 위치에서 북한군과의 전투를 벌였으며, 미 구축함과의 연락 두절로 인해 추가적인
... 어려움을 겪었다.
...
... 장사동에서 위급한 상황에 처한 독립 제1유격대대를 구출하기 위해 해군본부는 LT-1(인왕호)를 급파했으나, LST 문산호의 구출에 실패했다. 해군본부는 상륙부대의 철수를 지원하기 위해 LST 조치원호를 현지로 보냈다. 9월 18일,
... 이명흠 부대장은 유엔 해군과의 협력 하에 부족한 식량과 탄약 지원을 받았다. 9월 19일, 유엔군의 함포지원과 함께 LST 조치원호가 도착하여 철수 작전을 시작했다. 스피어 소령은 직접 해안에 상륙하여 구조작전을 지시하였다. 9월 2
... 0일, 725명이 부산항으로 복귀했으나, 32명이 장사동 해안에 남아 북한군의 포로가 되었거나 탈출하여 국군에 합류하였다.
...
... 장사리 전투가 인천 상륙작전의 양동작전으로 알려졌으나, 이 전투가 드라마틱한 요소로 인해 과장되었으며, 실제로는 인천 상륙작전과 큰 관련이 없다. 또한, 북한이나 중국의 군사적 상황을 고려할 때, 장사리에서의 전투가 낙동강 전선에 영
... 향을 끼칠 가능성은 낮다.
...
... ---
...
... 문제
... 1. 지문에 나오는 지명을 모두 쓰시오.
... 2. 그중 대게로 유명한 곳은?
지문에 나오는 지명은 다음과 같습니다:
- 포항
- 장사동
- 영덕군
- 부산
- 문산호
- 조치원호
- 스피어 소령
- 낙동강 전선
대게로 유명한 곳은 영덕군입니다.

To leave the Ollama shell, use Ctrl + d or /bye.

The M1 Pro in the 2021 14-inch MacBook Pro has a 10-core CPU and a 16-core GPU.

The M1 in the 2020 13-inch MacBook Pro has an 8-core CPU and an 8-core GPU. (This is the one I have.)

The CPU worked at up to 100% utilization. (Well done.)

Serving the model with LangServe

langserve_ollama % ll app
total 40
drwxr-xr-x@  8 dongsik  staff   256 Apr 19 16:35 .
drwxr-xr-x@ 12 dongsik  staff   384 Apr 20 09:22 ..
-rw-r--r--@  1 dongsik  staff     0 Apr 19 16:35 __init__.py
-rw-r--r--@  1 dongsik  staff   549 Apr 19 16:35 chain.py
-rw-r--r--@  1 dongsik  staff   723 Apr 19 16:35 chat.py
-rw-r--r--@  1 dongsik  staff   328 Apr 19 16:35 llm.py
-rw-r--r--@  1 dongsik  staff  1444 Apr 19 16:35 server.py
-rw-r--r--@  1 dongsik  staff   559 Apr 19 16:35 translator.py
(llm311) dongsik@dongsikleeui-MacBookPro langserve_ollama %

Update the llm model name in the four files chat.py, chain.py, llm.py, and translator.py to match my environment.

# Use another chat model supported by LangChain. Here we use Ollama.
#llm = ChatOllama(model="EEVE-Korean-10.8B:latest")
llm = ChatOllama(model="eeve:q5")

Run server.py

(llm311) dongsik@dongsikleeui-MacBookPro langserve_ollama % cd app
(llm311) dongsik@dongsikleeui-MacBookPro app % pwd
/Users/dongsik/GitHub/teddylee777/langserve_ollama/app
(llm311) dongsik@dongsikleeui-MacBookPro app % ll
total 40
drwxr-xr-x@  8 dongsik  staff   256 Apr 19 16:35 .
drwxr-xr-x@ 12 dongsik  staff   384 Apr 20 09:22 ..
-rw-r--r--@  1 dongsik  staff     0 Apr 19 16:35 __init__.py
-rw-r--r--@  1 dongsik  staff   584 Apr 20 13:15 chain.py
-rw-r--r--@  1 dongsik  staff   758 Apr 20 13:15 chat.py
-rw-r--r--@  1 dongsik  staff   363 Apr 20 13:15 llm.py
-rw-r--r--@  1 dongsik  staff  1444 Apr 19 16:35 server.py
-rw-r--r--@  1 dongsik  staff   594 Apr 20 13:15 translator.py
(llm311) dongsik@dongsikleeui-MacBookPro app % python server.py

http://0.0.0.0:8000/prompt/playground/

Question and answer

Switch to calling LangServe through RemoteRunnable

<path>/langserve_ollama/example

% ll
total 120
drwxr-xr-x@  9 dongsik  staff    288 Apr 20 13:50 .
drwxr-xr-x@ 12 dongsik  staff    384 Apr 20 09:22 ..
drwxr-xr-x@  3 dongsik  staff     96 Apr 19 16:35 .streamlit
-rw-r--r--@  1 dongsik  staff  12504 Apr 19 16:35 00-ollama-test.ipynb
-rw-r--r--@  1 dongsik  staff   4885 Apr 19 16:35 01-remote-invoke.ipynb
-rw-r--r--@  1 dongsik  staff   3775 Apr 19 16:35 02-more-examples.ipynb
-rw-r--r--@  1 dongsik  staff   6222 Apr 19 16:35 main.py
-rw-r--r--@  1 dongsik  staff  14708 Apr 19 16:35 requirements.txt

In 01-remote-invoke.ipynb, change the address to the local LangServe address.

from langserve import RemoteRunnable

# Set the ngrok remote address

#chain = RemoteRunnable("<your own domain configured in ngrok>/prompt/")
# chain = RemoteRunnable("https://poodle-deep-marmot.ngrok-free.app/prompt/")
chain = RemoteRunnable("http://0.0.0.0:8000/prompt/")

for token in chain.stream({"topic": "딥러닝에 대해서 알려줘"}):
    print(token, end="")

Port forwarding the local LangServe with ngrok

Sign up for ngrok

https://dashboard.ngrok.com/cloud-edge/domains

Download the installer for M1 and install it.

https://dashboard.ngrok.com/get-started/setup/macos

Set up the free domain

humble-curiously-antelope.ngrok-free.app

Forward the port LangServe is running on, specifying the ngrok domain

ngrok http --domain=humble-curiously-antelope.ngrok-free.app 8000

% ngrok http --domain=humble-curiously-antelope.ngrok-free.app 8000

ngrok                                                                                                                    (Ctrl+C to quit)

K8s Gateway API support available now: https://ngrok.com/r/k8sgb

Session Status                online
Account                       dongsik.lee (Plan: Free)
Version                       3.8.0
Region                        Japan (jp)
Latency                       45ms
Web Interface                 http://127.0.0.1:4040
Forwarding                    https://humble-curiously-antelope.ngrok-free.app -> http://localhost:8000

Connections                   ttl     opn     rt1     rt5     p50     p90
                              0       0       0.00    0.00    0.00    0.00

https://humble-curiously-antelope.ngrok-free.app/prompt/playground/

When I send a query through the ngrok URL, the GPU on the local server climbs to 100% while it generates the output.

Change the RemoteRunnable address in 01-remote-invoke.ipynb to the ngrok address and run it in VS Code.

from langserve import RemoteRunnable

# Set the ngrok remote address

#chain = RemoteRunnable("<your own domain configured in ngrok>/prompt/")
chain = RemoteRunnable("https://humble-curiously-antelope.ngrok-free.app/prompt/")
#chain = RemoteRunnable("http://0.0.0.0:8000/prompt/")

for token in chain.stream({"topic": "딥러닝에 대해서 알려줘"}):
    print(token, end="")

It works well.

Additional examples

Translator

from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Use another chat model supported by LangChain. Here we use Ollama.
# llm = ChatOllama(model="EEVE-Korean-10.8B:latest")
llm = ChatOllama(model="eeve:q5")

# Set up the prompt
prompt = ChatPromptTemplate.from_template(
    "Translate following sentences into Korean:\n{input}"
)

# Build the chain with LangChain Expression Language syntax.
chain = prompt | llm | StrOutputParser()
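
The | operator in the chain above pipes the output of one step into the next. To show the idea without installing anything, here is a toy sketch in plain Python. It is not LangChain's implementation: Step, fake_llm, and the other names are stand-ins I made up, and the "model" just echoes its input.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt, the model, and the output parser.
prompt = Step(lambda d: f"Translate following sentences into Korean:\n{d['input']}")
fake_llm = Step(lambda text: {"content": f"[model output for: {text!r}]"})
parser = Step(lambda message: message["content"])

chain = prompt | fake_llm | parser
print(chain.invoke({"input": "Hello"}))
```

In the real chain the prompt template, ChatOllama, and StrOutputParser play these three roles.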

Running the LLM as a Runnable

from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Use another chat model supported by LangChain. Here we use Ollama.
# llm = ChatOllama(model="EEVE-Korean-10.8B:latest")
llm = ChatOllama(model="eeve:q5")

Trying PDF RAG with Streamlit

To use OpenAIEmbeddings for the embeddings, OPENAI_API_KEY is read from a .env file.

% pip install python-dotenv

In main.py, set the OPENAI_API_KEY and update LANGSERVE_ENDPOINT to the ngrok address, then run it.
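
For readers unfamiliar with .env files, the sketch below shows roughly what loading one amounts to, using only the standard library. It is not how python-dotenv is implemented, parse_env_text is a hypothetical helper, and the key shown is a placeholder rather than a real credential.

```python
import os


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines; comments and blank lines are ignored."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder content standing in for a real .env file.
example = '# OpenAI credentials\nOPENAI_API_KEY="your-key-here"\n'
settings = parse_env_text(example)
for key, value in settings.items():
    os.environ.setdefault(key, value)  # keep any value that is already set
print(sorted(settings))
```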

% streamlit run main.py

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.168.0.10:8501

  For better performance, install the Watchdog module:

  $ xcode-select --install
  $ pip install watchdog

Example: SPRI_AI_Brief_2023년12월호_F.pdf

https://spri.kr/posts?code=AI-Brief

FileNotFoundError: [Errno 2] No such file or directory: 'pdfinfo'

% conda install poppler
Channels:
 - defaults
 - conda-forge
Platform: osx-arm64

% pip install pdftotext

FileNotFoundError: [Errno 2] No such file or directory: 'tesseract'

% brew install tesseract
==> Auto-updating Homebrew...
Adjust how often this is run with HOMEBREW_AUTO_UPDATE_SECS or disable with
HOMEBREW_NO_AUTO_UPDATE. Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).

% brew install tesseract-lang

UnicodeEncodeError: 'ascii' codec can't encode characters in position 22-23: ordinal not in range(128)

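Finally, let's ask a question against the PDF above.

Having actually worked through it, I can say the video covers a remarkable amount of material: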

- Ollama

- EEVE 양자화 모델

- LangServe

- ngrok

- Streamlit RAG

- Asitop

Thank you.

Monitoring my M1's status with asitop

% pip install asitop

% sudo asitop

Enter the sudo password.


Installing the lightweight Korean eeve GGUF model with Ollama on an M1, plus LangChain

a_mnesia 2024. 4. 12. 14:08


Earlier I ran the lightweight Gemma model with Ollama. This time I try one of the lightweight models that support Korean:

https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF


The site describes the installation and test procedures well. However, if the GPU CUDA version does not match, following the guide as written raises an exception.

Original model is yanolja/EEVE-Korean-Instruct-10.8B-v1.0
quantized using llama.cpp

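llama.cpp is a project built to run Llama models using 4-bit integer quantization, with low-level bindings available for access from Python. It is implemented in pure C/C++ with no dependencies and runs on macOS, Windows, and Linux.

A comparison of the three downloadable models: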

GGUF	ggml-model-Q4_K_M.gguf	ggml-model-Q5_K_M.gguf	ggml-model-f16.gguf
Size	6.51 GB	7.65 GB	21.6 GB
Metadata	Value	Value	Value
version	3	3	3
tensor_count	435	435	435
kv_count	24	24	23
general.architecture	llama	llama	llama
general.name	LLaMA v2	LLaMA v2	LLaMA v2
general.file_type	15	17	1
general.quantization_version	2	2	-
llama.context_length	4096	4096	4096
llama.embedding_length	4096	4096	4096
llama.block_count	48	48	48
llama.feed_forward_length	14336	14336	14336
llama.rope.dimension_count	128	128	128
llama.rope.freq_base	10000	10000	10000
llama.attention.head_count	32	32	32
llama.attention.head_count_kv	8	8	8
llama.attention.layer_norm_rms_epsilon	1E-05	1E-05	1E-05
tokenizer.ggml.model	llama	llama	llama
tokenizer.ggml.tokens	[<unk>, <s>, </s>, <0x00>, <0x01>, ...]	[<unk>, <s>, </s>, <0x00>, <0x01>, ...]	[<unk>, <s>, </s>, <0x00>, <0x01>, ...]
tokenizer.ggml.scores	[-1000, -1000, -1000, -1000, -1000, ...]	[-1000, -1000, -1000, -1000, -1000, ...]	[-1000, -1000, -1000, -1000, -1000, ...]
tokenizer.ggml.token_type	[3, 3, 3, 6, 6, ...]	[3, 3, 3, 6, 6, ...]	[3, 3, 3, 6, 6, ...]
tokenizer.ggml.bos_token_id	1	1	1
tokenizer.ggml.eos_token_id	32000	32000	32000
tokenizer.ggml.unknown_token_id	0	0	0
tokenizer.ggml.padding_token_id	2	2	2
tokenizer.ggml.add_bos_token	TRUE	TRUE	TRUE
tokenizer.ggml.add_eos_token	FALSE	FALSE	FALSE
tokenizer.chat_template	{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = 'You are a helpful assistant.' %}{% endif %}{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in loop_messages %}{% if loop.index0 == 0 %}{{'<\|im_start\|>system ' + system_message + '<\|im_end\|> '}}{% endif %}{{'<\|im_start\|>' + message['role'] + ' ' + message['content'] + '<\|im_end\|>' + ' '}}{% endfor %}{% if add_generation_prompt %}{{ '<\|im_start\|>assistant ' }}{% endif %}	{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = 'You are a helpful assistant.' %}{% endif %}{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in loop_messages %}{% if loop.index0 == 0 %}{{'<\|im_start\|>system ' + system_message + '<\|im_end\|> '}}{% endif %}{{'<\|im_start\|>' + message['role'] + ' ' + message['content'] + '<\|im_end\|>' + ' '}}{% endfor %}{% if add_generation_prompt %}{{ '<\|im_start\|>assistant ' }}{% endif %}	{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = 'You are a helpful assistant.' %}{% endif %}{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in loop_messages %}{% if loop.index0 == 0 %}{{'<\|im_start\|>system ' + system_message + '<\|im_end\|> '}}{% endif %}{{'<\|im_start\|>' + message['role'] + ' ' + message['content'] + '<\|im_end\|>' + ' '}}{% endfor %}{% if add_generation_prompt %}{{ '<\|im_start\|>assistant ' }}{% endif %}
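
The file sizes in the table give a feel for what each quantization level costs. As a back-of-the-envelope sketch, dividing the file size by the parameter count gives an approximate number of bits per weight. This assumes 1 GB means 10^9 bytes and takes 10.8 billion parameters from the model name; metadata and tokenizer data are ignored, so the figures are only rough.

```python
def bits_per_weight(size_gb: float, params_billion: float) -> float:
    """Approximate bits stored per parameter, assuming 1 GB = 10**9 bytes."""
    return size_gb * 8 / params_billion


# File sizes from the table above; 10.8 (billion) comes from the model name.
for name, size_gb in [("Q4_K_M", 6.51), ("Q5_K_M", 7.65), ("f16", 21.6)]:
    print(f"{name}: {bits_per_weight(size_gb, 10.8):.2f} bits per weight")
```

The f16 file lands at about 16 bits per weight, as expected, while Q4_K_M and Q5_K_M come out a little under 5 and 6 bits.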

Download the Q4 model, the smallest of the three.

https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF/resolve/main/ggml-model-Q4_K_M.gguf?download=true

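Then create a Modelfile and register the downloaded model with Ollama.

The GGML format being downloaded is said to be a quantization implementation optimized for Apple M1 and M2 silicon.

The Modelfile used to register the downloaded model file looks like this.

Modelfile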

FROM /Users/dongsik/GitHub/llm/eeve/EEVE-Korean-Instruct-10.8B-v1.0-GGUF/ggml-model-Q4_K_M.gguf

TEMPLATE """### User:
{{ .Prompt }}

### Assistant:
"""

PARAMETER temperature 0.1

PARAMETER num_ctx 4096
PARAMETER stop "</s>"
PARAMETER stop "### System:"
PARAMETER stop "### User:"
PARAMETER stop "### Assistant:"
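
When trying several GGUF files, it can be handy to generate the Modelfile text instead of editing it by hand. The helper below is a sketch under my own naming (build_modelfile is hypothetical); it only assembles the same FROM, TEMPLATE, and PARAMETER lines shown above and writes nothing to disk. The relative GGUF path is a placeholder.

```python
def build_modelfile(gguf_path, template, parameters, stops):
    """Assemble Modelfile text from a GGUF path, a template, and parameters."""
    lines = [f"FROM {gguf_path}", "", f'TEMPLATE """{template}"""', ""]
    lines += [f"PARAMETER {name} {value}" for name, value in parameters.items()]
    lines += [f'PARAMETER stop "{stop}"' for stop in stops]
    return "\n".join(lines) + "\n"


text = build_modelfile(
    "ggml-model-Q4_K_M.gguf",  # placeholder path
    "### User:\n{{ .Prompt }}\n\n### Assistant:\n",
    {"temperature": 0.1, "num_ctx": 4096},
    ["</s>", "### System:", "### User:", "### Assistant:"],
)
print(text)
```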

The Modelfile and the downloaded GGUF file are now ready.

% ll
total 40
drwxr-xr-x  8 dongsik  staff   256 Apr 12 14:28 .
drwxr-xr-x  7 dongsik  staff   224 Apr 12 11:33 ..
drwxr-xr-x  5 dongsik  staff   160 Apr 11 00:11 .ipynb_checkpoints
drwxr-xr-x  3 dongsik  staff    96 Apr 10 23:55 EEVE-Korean-Instruct-10.8B-v1.0-GGUF
-rw-r--r--  1 dongsik  staff   325 Apr 12 14:27 Modelfile
-rw-r--r--  1 dongsik  staff  5957 Apr 11 00:26 ollama_eeve_gguf.ipynb

Register the eeve model with Ollama.

% ollama create eeve:q4 -f Modelfile
2024/04/12 14:28:50 parser.go:73: WARN Unknown command:
2024/04/12 14:28:50 parser.go:73: WARN Unknown command:
2024/04/12 14:28:50 parser.go:73: WARN Unknown command:
transferring model data
creating model layer
creating template layer
creating parameters layer
creating config layer
using already created layer sha256:5a79b80eb5e2eec5cf5d514dfa32187872dde1dae6a2b9c8
using already created layer sha256:c3de887d2d041bfea1bfed395834ea828839af278003269e
using already created layer sha256:e6b785eab1777ecfc57eab9a85f9b623931e6f1079ae6d75
using already created layer sha256:8b03799cdb5862e5cdfda70f0e116193aa07f2309015a158
writing manifest
success

Once registration succeeds, the model shows up in the list.

Confirm that two models, gemma:2b and eeve:q4, are registered.

% ollama list
NAME       	ID          	SIZE  	MODIFIED
eeve:q4    	68f4c2c2d9fe	6.5 GB	8 seconds ago
gemma:2b   	b50d6c999e59	1.7 GB	2 days ago
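
If you want to use this listing from a script, the columns can be split on tabs or runs of spaces. The sketch below parses text in the shape shown above; parse_ollama_list is a hypothetical helper, and the column layout is assumed from this output rather than from any documented format.

```python
import re


def parse_ollama_list(output: str):
    """Turn `ollama list` style text into a list of dicts keyed by column name."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return []
    splitter = re.compile(r"\s{2,}|\t")  # tabs or two or more whitespace characters
    header = [name.lower() for name in splitter.split(lines[0].strip())]
    return [dict(zip(header, splitter.split(line.strip()))) for line in lines[1:]]


sample = (
    "NAME       \tID          \tSIZE  \tMODIFIED\n"
    "eeve:q4    \t68f4c2c2d9fe\t6.5 GB\t8 seconds ago\n"
    "gemma:2b   \tb50d6c999e59\t1.7 GB\t2 days ago\n"
)
for row in parse_ollama_list(sample):
    print(row["name"], row["size"])
```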

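To delete a model, use the rm command (ollama rm eeve:q4).

Ask the eeve:q4 model a question from the CLI.

Considering this is an M1 laptop, it answers quite quickly and the quality is impressive.

Below, for comparison, are the results of asking ChatGPT 3.5 and Wrtn's GPT-4 the same question.

Answer from eeve:q4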

% ollama run eeve:q4 하늘은 왜 파란색일까요?
하늘이 파란색인 이유는 대기 중에 있는 공기 분자와 먼지 입자들이 햇빛을 산란시키기 때문입니다. 태양빛이 지구 대기를 통과할 때, 이 작은 입자들과 상호 작용하여 빛을 다양한 방향으로 흩어지게 합니다. 우리가 하늘에서 보는 빛은 주로
파란색과 보라색의 가시광선 스펙트럼에 집중되어 있습니다.

이 현상은 빛이 공기 분자와 먼지 입자에 부딪힐 때 일어나는 산란 효과 때문입니다. 이 작은 입자들은 태양빛을 다양한 방향으로 흩어지게 하여, 우리가 하늘에서 보는 색깔을 만들어냅니다. 파란색과 보라색 빛은 다른 색들보다 더
산란되기 때문에, 우리 눈에 가장 많이 도달하여 하늘을 파랗게 보이게 합니다.

또한 대기 중에 있는 물방울이나 얼음 결정도 햇빛의 산란에 역할을 할 수 있습니다. 이 입자들이 태양빛과 상호 작용할 때, 빛을 다양한 방향으로 흩어지게 하여 하늘을 더욱 파랗게 보이게 만듭니다.

하늘이 항상 파란색인 것은 아니며, 대기 조건에 따라 색깔이 변할 수 있다는 점을 주목하는 것이 중요합니다. 예를 들어, 일출이나 일몰 때는 대기 중에 있는 먼지 입자와 물방울의 양이 증가하여 햇빛을 더 많이 산란시켜 하늘에 주황색,
분홍색 또는 보라색을 만들 수 있습니다.

요약하자면, 하늘이 파란색인 이유는 공기 분자, 먼지 입자 및 대기 중 물방울이나 얼음 결정과 같은 다양한 요소가 태양빛의 산란에 기여하기 때문입니다. 이 상호 작용은 파란색과 보라색 빛의 높은 농도를 만들어내어 우리가 하늘에서 보는
색깔을 만들어냅니다.

Answer from GPT-3.5

Answer from Wrtn GPT-4

Next, let's use Ollama with LangChain to see how well the model reasons.

import logging
import time

# Configure basic logging
logging.basicConfig(level=logging.INFO)

try:
    # Import inside the try block so the ImportError handler below can catch a missing package.
    from langchain_community.llms import Ollama

    llm = Ollama(model="eeve:q4")

    # Make sure the prompt is well defined (adjust it to the model's capabilities).
    start = time.time()
    prompt = ("한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주")

    response = llm.invoke(prompt)
    print(response)
    # Elapsed time in seconds
    print(time.time() - start)

except ImportError:
    logging.error("Failed to import Ollama from langchain_community. Is the package installed?")
except Exception as e:
    logging.error(f"An unexpected error occurred: {e}")

정답은 (D) 서울입니다.
3.465108871459961


Installing Ollama + Gemma + LangChain on an M1 and giving it a quick run

a_mnesia 2024. 4. 10. 10:52


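This post was written for a Mac (M1).

To run a large language model locally, I install Ollama, download gemma, Google's state-of-the-art lightweight open model, give it a quick run,

and then connect it through LangChain.

Last night, 권진영 from the "LLM RAG Langchain 통합" chat room kindly explained how to install and use it, so I am writing it up in the hope that others can easily install and try it too.

The Ollama GitHub page offers installers that you can download for your local environment.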

https://github.com/ollama/ollama


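Download the macOS installer from there and install it. The installation is surprisingly simple: once the download finishes, the installer file appears in your download directory, and you double-click it to install.

Launching it starts the installation as shown below. Click Next.

The installation finishes quickly.

Once it is installed, open Terminal and run Ollama.

Gemma is offered in two model sizes, so let's start with the smallest one.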

% ollama run gemma:2b
pulling manifest
pulling c1864a5eb193...   5% ▕███                     ▏  87 MB/1.7 GB  8.0 MB/s   3m17s

When the download completes, the setup is done and a prompt appears where you can send messages.

% ollama run gemma:2b
pulling manifest
pulling c1864a5eb193... 100% ▕██████████████████████████████████████████████▏ 1.7 GB
pulling 097a36493f71... 100% ▕██████████████████████████████████████████████▏ 8.4 KB
pulling 109037bec39c... 100% ▕██████████████████████████████████████████████▏  136 B
pulling 22a838ceb7fb... 100% ▕██████████████████████████████████████████████▏   84 B
pulling 887433b89a90... 100% ▕██████████████████████████████████████████████▏  483 B
verifying sha256 digest
writing manifest
removing any unused layers
success
>>> hi
Hi! 👋 How can I assist you today? 😊

Is there anything I can help you with?

>>> Send a message (/? for help)

I simply greeted it with "hi".

Now let's leave the prompt by typing "/bye".

>>> /bye
Hello! 👋 It's nice to hear from you. How can I help you today? 😊

Is there anything I can do for you?

>>> /bye
(base) dongsik@dongsikleeui-MacBookPro ~ %

If it does not exit right away, type "/bye" once more.

You can also check which models are installed.

% ollama list
NAME    	ID          	SIZE  	MODIFIED
gemma:2b	b50d6c999e59	1.7 GB	6 minutes ago

You can also delete an installed model. Installing and removing models is really simple.

% ollama rm gemma:2b
deleted 'gemma:2b'
% ollama list
NAME	ID	SIZE	MODIFIED
%
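
The same listing can be obtained from Python, since the Ollama server exposes the installed models over HTTP at `/api/tags`. The sketch below assumes the server is running on the default `http://localhost:11434` and that each entry carries `name` and `size` (in bytes) fields; the helper names `fetch_models` and `summarize_models` are my own.

```python
import json
import urllib.request
from typing import List, Tuple


def summarize_models(payload: dict) -> List[Tuple[str, str]]:
    """Turn an /api/tags payload into (name, human-readable size) pairs."""
    rows = []
    for model in payload.get("models", []):
        size_gb = model.get("size", 0) / 1e9  # size is reported in bytes
        rows.append((model.get("name", "?"), f"{size_gb:.1f} GB"))
    return rows


def fetch_models(base_url: str = "http://localhost:11434") -> List[Tuple[str, str]]:
    """Ask a running Ollama server for its installed models (needs the server up)."""
    with urllib.request.urlopen(base_url + "/api/tags") as response:
        return summarize_models(json.load(response))
```

Calling `fetch_models()` while Ollama is running should give roughly the NAME and SIZE columns of `ollama list`.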

Now let's install the model again and go over some basic usage.

1. Running with LangChain

# Install LangChain
pip install langchain

import langchain
# Check the LangChain version
print('LangChain version:', langchain.__version__)

Result
LangChain version: 0.1.12

Configure LangChain to use the locally installed Ollama gemma:2b model and run it.

import logging

# Configure logging
logging.basicConfig(level=logging.INFO)

try:
    # Import inside the try block so the ImportError handler below can actually catch a missing package
    from langchain_community.llms import Ollama

    llm = Ollama(model="gemma:2b")

    # Make sure the prompt is well defined (adjust it to the model's capabilities)
    prompt = "Why is the sky blue?"

    response = llm.invoke(prompt)
    print(response)
except ImportError:
    logging.error("Failed to import Ollama from langchain_community. Is the package installed?")
except Exception as e:
    logging.error(f"An unexpected error occurred: {e}")

Result:
The sky appears blue due to Rayleigh scattering. This scattering process occurs when light interacts with molecules in the Earth's atmosphere. 

* **Blue light has a longer wavelength than other colors**. This means it can penetrate further into the atmosphere. 
* **Blue light waves have more energy** than other colors, so they are more likely to scatter. 
* **Water vapor molecules** in the atmosphere absorb blue light more efficiently than other colors. 
* **Scattered blue light** is scattered in all directions equally, giving the sky its blue color.

The amount and intensity of blue scattering depends on several factors, including:

* **Particle size and density of the particles**: Smaller particles scatter light more efficiently than larger particles. 
* **The wavelength of light**: Blue light is scattered more strongly than other colors. 
* **Atmospheric conditions**: Temperature, humidity, and air density can also affect scattering.

Overall, the scattering of sunlight in the atmosphere creates the blue color of the sky.

Next, call the model through its URL and stream the result to standard output (stdout).

from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_community.llms.ollama import Ollama

llm = Ollama(
    base_url="http://localhost:11434",
    model="gemma:2b",
    callback_manager=CallbackManager(
        [StreamingStdOutCallbackHandler()],
    ),
)

prompt = ("Why is the sky blue?")
response = llm.invoke(prompt)
# The streaming callback has already written the answer to stdout token by token,
# so printing the response again here would show the same text twice.

Result:
The sky appears blue due to Rayleigh scattering. Rayleigh scattering is the scattering of light by particles of a shorter wavelength, such as blue light. This scattering causes longer wavelengths, such as red and yellow light, to be scattered more than blue light. As a result, the sky appears blue to us.

https://github.com/ollama/ollama/blob/main/docs/api.md

2. Calling /api/generate with curl from the command line

Streaming

Request
% curl http://localhost:11434/api/generate -d '{
  "model": "gemma:2b",
  "prompt":"Why is the sky blue?"
}'

Response
{"model":"gemma:2b","created_at":"2024-04-10T01:46:35.254492Z","response":"The","done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:46:35.291573Z","response":" sky","done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:46:35.325664Z","response":" appears","done":false}
... <omitted>
{"model":"gemma:2b","created_at":"2024-04-10T01:46:44.741546Z","response":" the","done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:46:44.775088Z","response":" atmosphere","done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:46:44.810226Z","response":".","done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:46:44.845784Z","response":"","done":true,"context":[106,1645,108,4385,603,573,8203,3868,235336,107,108,106,2516,108,651,8203,8149,3868,3402,577,153902,38497,235265,1417,38497,12702,1185,33365,113211,675,24582,575,573,10379,235303,235256,13795,235269,14076,24582,576,23584,578,16175,235265,109,235287,5231,5200,2611,919,476,5543,35571,688,1178,3868,2611,235265,1417,3454,674,1185,33365,30866,573,13795,235269,978,3868,2611,603,30390,3024,774,1167,2116,235265,108,235287,5231,10716,2611,919,476,25270,35571,688,578,603,30390,978,16347,1178,3118,2611,235265,108,235287,714,5231,10526,576,38497,688,12014,611,573,35571,576,573,2611,235265,11569,235269,3868,2611,603,30390,978,1178,1156,9276,235265,108,235287,714,13795,919,978,23584,24582,1178,16175,24582,235269,948,3454,674,978,3118,2611,603,30390,3024,235265,1417,603,3165,573,8203,8149,3868,235265,109,4858,708,1009,5942,4691,1105,573,3868,8203,235292,109,235287,714,3868,2881,603,5231,38131,576,5809,168428,1417,3454,674,573,8203,877,4824,3868,20853,576,1368,5342,689,7033,665,603,235265,108,235287,714,3868,2881,603,1170,5231,1665,10918,731,38636,168428,1417,3454,674,573,8203,877,4824,3868,793,4391,1368,1536,692,708,575,573,2134,235265,108,235287,714,3868,2881,603,476,5231,2667,576,2611,38497,168428,1417,3454,674,2611,603,30390,575,832,16759,731,24582,575,573,13795,235265,108,235287,714,3868,2881,576,573,8203,603,476,5231,28205,44299,168428,1417,603,1861,573,13795,603,780,13596,12876,235269,578,573,38497,2185,12014,611,573,6581,576,573,33365,8761,577,573,16071,575,573,13795,235265,107,108],"total_duration":13164771833,"load_duration":3454744833,"prompt_eval_count":15,"prompt_eval_duration":115430000,"eval_count":282,"eval_duration":9592944000}
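
Each line of the streamed response is a separate JSON object carrying one fragment in `response`, and the last one has `"done": true` plus the statistics. A minimal sketch of stitching these back together in Python follows; it uses only the standard library, and the function names `collect_stream` and `stream_generate` are hypothetical, chosen just for this example.

```python
import json
import urllib.request
from typing import Iterable, Tuple, Union


def collect_stream(lines: Iterable[Union[str, bytes]]) -> Tuple[str, dict]:
    """Join the "response" fragments and return them with the final (done) object."""
    parts = []
    final = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between JSON objects
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            final = chunk
            break
    return "".join(parts), final


def stream_generate(prompt: str, model: str = "gemma:2b",
                    base_url: str = "http://localhost:11434") -> Tuple[str, dict]:
    """POST to /api/generate and collect the streamed answer (needs a running server)."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        # The HTTP response object can be iterated line by line
        return collect_stream(response)
```

`collect_stream` does not need a server, so it can also be tried on lines saved from the curl output above.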

No streaming

Request
% curl http://localhost:11434/api/generate -d '{
  "model": "gemma:2b",
  "prompt":"Why is the sky blue?",
  "stream": false
}'

Response
{"model":"gemma:2b","created_at":"2024-04-10T01:49:38.296228Z","response":"The sky is blue due to Rayleigh scattering. Rayleigh scattering is the scattering of light by particles of a shorter wavelength. This means that blue light has a greater wavelength and is scattered more than other colors. This is why the sky appears blue.","done":true,"context":[106,1645,108,4385,603,573,8203,3868,235336,107,108,106,2516,108,651,8203,603,3868,3402,577,153902,38497,235265,153902,38497,603,573,38497,576,2611,731,16071,576,476,25270,35571,235265,1417,3454,674,3868,2611,919,476,6561,35571,578,603,30390,978,1178,1156,9276,235265,1417,603,3165,573,8203,8149,3868,235265,107,108],"total_duration":1936533375,"load_duration":3180292,"prompt_eval_duration":272404000,"eval_count":49,"eval_duration":1658092000}
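
The `*_duration` fields in these responses are, as far as I understand the Ollama API documentation, reported in nanoseconds, so a rough generation speed can be estimated as `eval_count / eval_duration * 10^9`. Here is a small sketch of that arithmetic; the helper names are mine.

```python
from typing import Optional

NANOSECONDS_PER_SECOND = 1_000_000_000


def to_seconds(nanoseconds: int) -> float:
    """Convert a duration field from nanoseconds to seconds."""
    return nanoseconds / NANOSECONDS_PER_SECOND


def tokens_per_second(response: dict) -> Optional[float]:
    """Estimate generation speed, or return None if the fields are missing."""
    count = response.get("eval_count")
    duration = response.get("eval_duration")
    if not count or not duration:
        return None
    return count / to_seconds(duration)
```

For the response above (`eval_count` 49, `eval_duration` 1658092000) this comes out at roughly 29.5 tokens per second, with a total duration of about 1.94 seconds.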

3. Calling /api/chat with curl from the command line

Chat Request (Streaming)

Request
% curl http://localhost:11434/api/chat -d '{
  "model": "gemma:2b",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'

Response
{"model":"gemma:2b","created_at":"2024-04-10T01:27:16.070998Z","message":{"role":"assistant","content":"The"},"done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:27:16.108371Z","message":{"role":"assistant","content":" sky"},"done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:27:16.142158Z","message":{"role":"assistant","content":" appears"},"done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:27:16.175229Z","message":{"role":"assistant","content":" blue"},"done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:27:16.207642Z","message":{"role":"assistant","content":" due"},"done":false}
... <omitted>
{"model":"gemma:2b","created_at":"2024-04-10T01:27:25.624649Z","message":{"role":"assistant","content":" higher"},"done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:27:25.658043Z","message":{"role":"assistant","content":" temperatures"},"done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:27:25.692536Z","message":{"role":"assistant","content":"."},"done":false}
{"model":"gemma:2b","created_at":"2024-04-10T01:27:25.725932Z","message":{"role":"assistant","content":""},"done":true,"total_duration":12244362334,"load_duration":2484924542,"prompt_eval_count":15,"prompt_eval_duration":103298000,"eval_count":286,"eval_duration":9654690000}

Chat request (No streaming)

Request
% curl http://localhost:11434/api/chat -d '{
  "model": "gemma:2b",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ],
  "stream": false
}'

Response
{"model":"gemma:2b","created_at":"2024-04-10T01:30:16.821415Z","message":{"role":"assistant","content":"The sky appears blue due to Rayleigh scattering. This phenomenon occurs when sunlight interacts with molecules in the Earth's atmosphere.\n\n**Rayleigh Scattering:**\n\n* Sunlight is composed of all colors of the spectrum, including blue, violet, yellow, orange, and red.\n* When sunlight enters the atmosphere, it interacts with molecules such as nitrogen and oxygen molecules.\n* These molecules have different sizes and structures, which cause different wavelengths of light to scatter in different directions.\n* Blue light, with its shorter wavelengths, is scattered more strongly than other colors due to its shorter path length through the atmosphere.\n\n**Blue Sky:**\n\n* As a result, blue light is scattered in all directions from the Sun.\n* This scattering effect spreads out the Sun's light throughout the atmosphere, making the sky appear blue.\n* The intensity of blue light can vary slightly depending on factors such as altitude, temperature, and atmospheric conditions.\n\n**Other Factors:**\n\n* The scattering process depends on the size and density of the molecules, which is why the sky appears blue even though the Sun is a star of much greater temperature.\n* The atmosphere is composed of different gases with varying densities, which influences the scattering process.\n* Cloud and pollution can also affect the sky's color, with clouds reflecting blue light more efficiently than other colors.\n\n**Conclusion:**\n\nThe blue color of the sky is primarily caused by Rayleigh scattering of sunlight by molecules in the Earth's atmosphere. This scattering process spreads out the Sun's light throughout the sky, making it appear blue to us on Earth."},"done":true,"total_duration":11417865583,"load_duration":4883667,"prompt_eval_duration":270041000,"eval_count":324,"eval_duration":11140497000}
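
Unlike /api/generate, the chat endpoint takes a list of `messages`, so a follow-up question is asked by sending the earlier turns along with the new one. The sketch below shows one way to keep such a history in Python. It assumes the non-streaming response shape shown above, and `chat_payload` and `append_reply` are illustrative names, not part of any library.

```python
import json
from typing import Dict, List

Message = Dict[str, str]


def chat_payload(model: str, messages: List[Message], stream: bool = False) -> bytes:
    """Build the JSON request body for /api/chat."""
    return json.dumps({"model": model, "messages": messages, "stream": stream}).encode("utf-8")


def append_reply(messages: List[Message], response: dict) -> List[Message]:
    """Return a new history with the assistant message of a response appended."""
    reply = response["message"]
    return messages + [{"role": reply["role"], "content": reply["content"]}]
```

After each call, the idea would be to append the reply with `append_reply`, add the next `{"role": "user", ...}` entry, and send the whole list again.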
