<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>colab &#8211; Max的程式語言筆記</title>
	<atom:link href="https://stackoverflow.max-everyday.com/tag/colab/feed/" rel="self" type="application/rss+xml" />
	<link>https://stackoverflow.max-everyday.com</link>
	<description>I want to be a goofball and live happily every day</description>
	<lastBuildDate>Mon, 03 Mar 2025 12:30:23 +0000</lastBuildDate>
	<language>zh-TW</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.1</generator>

<image>
	<url>https://stackoverflow.max-everyday.com/wp-content/uploads/2017/02/max-stackoverflow-256.png</url>
	<title>colab &#8211; Max的程式語言筆記</title>
	<link>https://stackoverflow.max-everyday.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>What are the options for bundling multiple files into a single archive before uploading to Google Drive?</title>
		<link>https://stackoverflow.max-everyday.com/2025/03/colab-zip-to-google-drive/</link>
					<comments>https://stackoverflow.max-everyday.com/2025/03/colab-zip-to-google-drive/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Sun, 02 Mar 2025 07:52:39 +0000</pubDate>
				<category><![CDATA[電腦相關應用]]></category>
		<category><![CDATA[colab]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=6472</guid>

					<description><![CDATA[Uploading .png and .txt files one by one to G...]]></description>
										<content:encoded><![CDATA[
<p>Uploading <code>.png</code> and <code>.txt</code> files to Google Drive one by one is very slow, so I recommend <strong>packing them into a single archive on your local machine first</strong>, uploading that to Google Drive, and then unzipping it inside Google Colab. This dramatically speeds up the upload and cuts processing time.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading"><strong>📌 Feasible packing methods</strong></h2>



<p>Here are several workable options, with their use cases and a recommendation rating:</p>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Method</th><th>Format</th><th>Compression ratio</th><th>Upload speed</th><th>Rating</th></tr></thead><tbody><tr><td>1&#x20e3; <strong>ZIP</strong></td><td><code>.zip</code></td><td>Medium</td><td><strong>Fast</strong></td><td>⭐⭐⭐⭐⭐</td></tr><tr><td>2&#x20e3; <strong>TAR.GZ</strong></td><td><code>.tar.gz</code></td><td><strong>High</strong></td><td>Fast</td><td>⭐⭐⭐⭐</td></tr><tr><td>3&#x20e3; <strong>7z</strong></td><td><code>.7z</code></td><td><strong>Highest</strong></td><td>Slower</td><td>⭐⭐⭐</td></tr><tr><td>4&#x20e3; <strong>Direct Google Drive sync</strong></td><td>Uncompressed</td><td>None</td><td><strong>Slowest</strong></td><td>⭐</td></tr></tbody></table></figure>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading"><strong>🔥 Recommended: ZIP compression</strong></h2>



<p><strong>Advantages:</strong></p>



<ul class="wp-block-list">
<li><strong>Highly compatible</strong> (supported on Windows, Mac, and Linux)</li>



<li><strong>Fast to extract</strong> (faster than <code>.tar.gz</code> and <code>.7z</code>)</li>



<li><strong>Dramatically shorter upload time</strong></li>
</ul>



<h3 class="wp-block-heading"><strong>📍 Steps</strong></h3>



<h3 class="wp-block-heading"><strong>(1) Pack the files on your local machine</strong></h3>



<p>Windows users (PowerShell):</p>



<pre class="wp-block-code"><code>Compress-Archive -Path instance_images -DestinationPath zenmaru_dataset.zip
</code></pre>



<p>Mac / Linux users:</p>



<pre class="wp-block-code"><code>zip -r zenmaru_dataset.zip instance_images/
</code></pre>
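<p>If you prefer to build the same archive from Python (handy when a script also prepares the dataset), the standard library can do it; a minimal sketch with illustrative paths and a hypothetical helper name:</p>

<pre class="wp-block-code"><code>import shutil

def pack_dataset(src_dir: str, archive_stem: str) -> str:
    """Zip src_dir into archive_stem + ".zip" and return the archive path."""
    # shutil.make_archive appends the ".zip" extension itself
    return shutil.make_archive(archive_stem, "zip", root_dir=src_dir)

# e.g. pack_dataset("instance_images", "zenmaru_dataset")
</code></pre>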



<h3 class="wp-block-heading"><strong>(2) Upload the ZIP to Google Drive</strong></h3>



<p>Manually upload it to the <code>/content/drive/MyDrive/AI/datasets/</code> directory (the path as seen from Colab once Drive is mounted).</p>
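<p>Before uploading, it can save a round trip to verify the archive locally; a small stdlib sketch (the function name is just an example):</p>

<pre class="wp-block-code"><code>import zipfile

def verify_zip(path: str) -> int:
    """CRC-check every member; return the entry count if the archive is intact."""
    with zipfile.ZipFile(path) as zf:
        bad = zf.testzip()  # returns the first corrupt member's name, or None
        if bad is not None:
            raise ValueError(f"corrupt member: {bad}")
        return len(zf.namelist())
</code></pre>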



<h3 class="wp-block-heading"><strong>(3) Unzip in Google Colab</strong></h3>



<pre class="wp-block-code"><code>import os
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# Make sure the target directory exists
os.makedirs("/content/drive/MyDrive/AI/datasets/zenmaru_dataset", exist_ok=True)

# Unzip the archive (Colab shell escape)
!unzip -o "/content/drive/MyDrive/AI/datasets/zenmaru_dataset.zip" -d "/content/drive/MyDrive/AI/datasets/zenmaru_dataset/"
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading"><strong>🎯 Comparing the other options</strong></h2>



<h3 class="wp-block-heading"><strong>Option 2&#x20e3;: TAR.GZ (for very large datasets)</strong></h3>



<p>Best for datasets <strong>over 1GB</strong>; its compression ratio is higher than ZIP's.</p>



<pre class="wp-block-code"><code>tar -czvf zenmaru_dataset.tar.gz instance_images/
</code></pre>



<p>Unzip in Colab:</p>



<pre class="wp-block-code"><code>!tar -xzvf "/content/drive/MyDrive/AI/datasets/zenmaru_dataset.tar.gz" -C "/content/drive/MyDrive/AI/datasets/zenmaru_dataset/"
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading"><strong>Option 3&#x20e3;: 7z (highest compression, but slower)</strong></h3>



<p>Best for datasets <strong>over 10GB</strong>.</p>



<pre class="wp-block-code"><code>7z a zenmaru_dataset.7z instance_images/
</code></pre>



<p>Unzip in Colab:</p>



<pre class="wp-block-code"><code>!apt-get install -y p7zip-full  # in case 7z is not already available
!7z x "/content/drive/MyDrive/AI/datasets/zenmaru_dataset.7z" -o"/content/drive/MyDrive/AI/datasets/zenmaru_dataset/"
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading"><strong>💡 Summary</strong></h2>



<p>1&#x20e3; <strong>Typical case</strong> ➝ <strong>ZIP (recommended)</strong>: best compatibility, fast extraction.<br>2&#x20e3; <strong>Large datasets (&gt;1GB)</strong> ➝ <strong>TAR.GZ</strong>: higher compression, moderate extraction speed.<br>3&#x20e3; <strong>Very large datasets (&gt;10GB)</strong> ➝ <strong>7z</strong>: highest compression, but slower to extract on Colab.</p>



<p>If your dataset is under <strong>10GB</strong>, go with <strong>ZIP</strong>; it extracts fastest on Google Colab! 🔥</p>
]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2025/03/colab-zip-to-google-drive/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Stable Diffusion 1.5 + ControlNet: learning Noto Sans and Zen Maru Gothic</title>
		<link>https://stackoverflow.max-everyday.com/2025/03/sd-control-net-nato-sans-zen-maru-gothic/</link>
					<comments>https://stackoverflow.max-everyday.com/2025/03/sd-control-net-nato-sans-zen-maru-gothic/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Sat, 01 Mar 2025 13:32:05 +0000</pubDate>
				<category><![CDATA[機器學習筆記]]></category>
		<category><![CDATA[colab]]></category>
		<category><![CDATA[stable-diffusion]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=6466</guid>

					<description><![CDATA[The complete steps to train on Google Colab...]]></description>
										<content:encoded><![CDATA[
<p>The complete steps for training Stable Diffusion 1.5 + ControlNet (<code>control_v11p_sd15_canny</code>) on Google Colab to learn the Noto Sans and Zen Maru Gothic font styles.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">📌 <strong>Environment setup</strong></h2>



<h3 class="wp-block-heading">1&#x20e3; <strong>Mount Google Drive</strong></h3>



<p>Run the following in Colab so you can access your dataset and output directories:</p>



<pre class="wp-block-code"><code>from google.colab import drive
drive.mount('/content/drive')
</code></pre>



<p>Confirm that the <code>zenmaru_dataset</code> directory exists:</p>



<pre class="wp-block-code"><code>!ls "/content/drive/MyDrive/AI/datasets/zenmaru_dataset"
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">📌 <strong>Install dependencies</strong></h2>



<h3 class="wp-block-heading">2&#x20e3; <strong>Install <code>diffusers</code>, <code>transformers</code>, <code>accelerate</code>, and related packages</strong></h3>



<pre class="wp-block-code"><code>!pip install diffusers transformers accelerate safetensors torchvision
!pip install opencv-python
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">📌 <strong>Download Stable Diffusion 1.5 and ControlNet</strong></h2>



<h3 class="wp-block-heading">3&#x20e3; <strong>Download the <code>stable-diffusion-v1-5</code> model</strong></h3>



<pre class="wp-block-code"><code>import torch
from diffusers import StableDiffusionPipeline

MODEL_PATH = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(MODEL_PATH, torch_dtype=torch.float16).to("cuda")
</code></pre>



<h3 class="wp-block-heading">4&#x20e3; <strong>Download ControlNet (<code>control_v11p_sd15_canny</code>)</strong></h3>



<pre class="wp-block-code"><code>from diffusers import ControlNetModel

CONTROLNET_PATH = "lllyasviel/control_v11p_sd15_canny"
controlnet = ControlNetModel.from_pretrained(CONTROLNET_PATH, torch_dtype=torch.float16).to("cuda")
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">📌 <strong>Data preprocessing</strong></h2>



<h3 class="wp-block-heading">5&#x20e3; <strong>Load and preprocess the Zenmaru dataset</strong></h3>



<pre class="wp-block-code"><code>import os
import torch
import cv2
import numpy as np
from PIL import Image
from torchvision import transforms

DATASET_DIR = "/content/drive/MyDrive/AI/datasets/zenmaru_dataset"
OUTPUT_DIR = "/content/drive/MyDrive/AI/output/zenmaru_lora"

# Load the images and build Canny edge maps
def preprocess_images(data_path, resolution=512):
    image_files = [os.path.join(data_path, f) for f in os.listdir(data_path) if f.endswith(('.png', '.jpg'))]

    images, edge_maps = [], []
    transform = transforms.Compose([
        transforms.Resize((resolution, resolution)),
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5])
    ])

    for img_path in image_files:
        img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
        img_rgb = cv2.imread(img_path)
        img_rgb = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2RGB)

        # Generate the Canny edge map
        edges = cv2.Canny(img, 100, 200)
        edges = cv2.cvtColor(edges, cv2.COLOR_GRAY2RGB)

        images.append(transform(Image.fromarray(img_rgb)))
        edge_maps.append(transform(Image.fromarray(edges)))

    return torch.stack(images), torch.stack(edge_maps)

train_images, train_edges = preprocess_images(DATASET_DIR)
print(f"✅ Loaded {train_images.shape[0]} training images")
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">📌 <strong>Train the LoRA</strong></h2>



<h3 class="wp-block-heading">6&#x20e3; <strong>Configure LoRA</strong></h3>



<pre class="wp-block-code"><code>from peft import LoraConfig, get_peft_model

LORA_RANK = 8  # LoRA low-rank dimension
lora_config = LoraConfig(
    r=LORA_RANK, lora_alpha=16, target_modules=["to_q", "to_k", "to_v"], lora_dropout=0.1, bias="none"
)

pipe.unet = get_peft_model(pipe.unet, lora_config)
pipe.unet.print_trainable_parameters()
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading">7&#x20e3; <strong>Training loop</strong></h3>



<pre class="wp-block-code"><code>from torch import nn
from torch.optim import AdamW
from tqdm import tqdm
from diffusers import DDPMScheduler

BATCH_SIZE = 2
EPOCHS = 1
LEARNING_RATE = 1e-4

device = "cuda"
dtype = torch.float16
optimizer = AdamW(pipe.unet.parameters(), lr=LEARNING_RATE)
noise_scheduler = DDPMScheduler.from_pretrained(MODEL_PATH, subfolder="scheduler")

print("🚀 Starting LoRA training...")
for epoch in range(EPOCHS):
    loop = tqdm(range(0, len(train_images), BATCH_SIZE), desc=f"Epoch {epoch+1}/{EPOCHS}")
    for i in loop:
        batch = train_images[i:i + BATCH_SIZE].to(device).to(dtype)
        edges = train_edges[i:i + BATCH_SIZE].to(device).to(dtype)

        # Text embeddings
        text_inputs = pipe.tokenizer(
            ["Zen Maru Gothic Style"] * batch.shape[0],
            padding="max_length",
            max_length=77,
            return_tensors="pt"
        ).to(device)
        text_embeddings = pipe.text_encoder(text_inputs.input_ids)[0].to(dtype)

        # Encode images into latents and add noise at a random timestep
        latents = pipe.vae.encode(batch).latent_dist.sample() * 0.18215
        noise = torch.randn_like(latents)
        timesteps = torch.randint(
            0, noise_scheduler.config.num_train_timesteps, (latents.shape[0],), device=device
        ).long()
        noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

        # ControlNet residuals conditioned on the Canny edge maps
        down_res, mid_res = controlnet(
            noisy_latents, timesteps,
            encoder_hidden_states=text_embeddings,
            controlnet_cond=edges, return_dict=False
        )

        # The U-Net predicts the added noise
        noise_pred = pipe.unet(
            noisy_latents, timesteps,
            encoder_hidden_states=text_embeddings,
            down_block_additional_residuals=down_res,
            mid_block_additional_residual=mid_res
        ).sample

        # MSE between predicted and true noise
        loss = nn.functional.mse_loss(noise_pred, noise)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        loop.set_postfix(loss=loss.item())
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading">8&#x20e3; <strong>Save the LoRA weights</strong></h3>



<pre class="wp-block-code"><code>output_path = os.path.join(OUTPUT_DIR, "zenmaru_lora.pth")
torch.save(pipe.unet.state_dict(), output_path)
print(f"✅ Training complete; model saved to {output_path}")
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">📌 <strong>Inference: generating missing glyphs</strong></h2>



<h3 class="wp-block-heading">9&#x20e3; <strong>Load the trained LoRA</strong></h3>



<pre class="wp-block-code"><code>pipe.unet.load_state_dict(torch.load(output_path))
pipe.to(device)
</code></pre>



<h3 class="wp-block-heading">🔟 <strong>Generate missing Zenmaru glyphs from Noto Sans</strong></h3>



<pre class="wp-block-code"><code>def generate_missing_character(nato_image_path):
    # Load the Noto Sans glyph image
    nato_image = Image.open(nato_image_path).convert("RGB")
    nato_tensor = transforms.ToTensor()(nato_image).unsqueeze(0).to(device).to(dtype)

    # Convert to Canny edges
    nato_gray = cv2.cvtColor(np.array(nato_image), cv2.COLOR_RGB2GRAY)
    nato_edges = cv2.Canny(nato_gray, 100, 200)
    nato_edges = Image.fromarray(nato_edges)
    nato_edges_tensor = transforms.ToTensor()(nato_edges).unsqueeze(0).to(device).to(dtype)

    # Generate the missing Zenmaru glyph
    # (assumes `pipe` is a ControlNet-enabled pipeline that accepts `image=`)
    with torch.no_grad():
        result = pipe(
            prompt="Zen Maru Gothic Style",
            image=nato_edges_tensor,
            guidance_scale=7.5
        ).images[0]

    return result

generated_image = generate_missing_character("/content/drive/MyDrive/AI/datasets/nato_sans_missing.png")
generated_image.save("/content/drive/MyDrive/AI/output/zenmaru_missing_character.png")
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">🎯 <strong>Summary</strong></h2>



<ul class="wp-block-list">
<li>Use <strong>Stable Diffusion 1.5</strong> with <strong>ControlNet (<code>control_v11p_sd15_canny</code>)</strong> to learn <strong>Noto Sans</strong> and <strong>Zen Maru Gothic</strong>.</li>



<li>Train a LoRA to fine-tune the model so it can generate Zen Maru's missing glyphs.</li>



<li>Use <strong>Canny edge detection</strong> to extract Noto Sans glyph outlines and infer the missing glyphs.</li>
</ul>



<p>With this you can use Noto Sans to fill in the glyphs missing from the Zen Maru style! 🚀</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p>A fresh attempt, with this prompt:</p>



<pre class="wp-block-code"><code>Train Stable Diffusion 1.5 + ControlNet (control_v11p_sd15_canny) on Colab to learn the Noto Sans and Zen Maru Gothic font styles; use Canny edge detection to extract the Noto Sans glyph outlines and infer Zen Maru Gothic's missing glyphs.
* LoRA training
* Use the AdamW8bit optimizer from bitsandbytes to reduce VRAM usage
* float16 mixed precision
* Convert the loaded images to 4-channel images
* Use controlnet_aux for the Canny edge detector
* Training image path: DATASET_DIR = "/root/AI/datasets/zenmaru_dataset/instance_images"
* The training image directory also contains .txt files; read them as captions
Please provide the complete training script</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p>Here is the complete training script for Google Colab: it trains Stable Diffusion 1.5 + ControlNet (control_v11p_sd15_canny) to learn <strong>Noto Sans glyph outlines + Zen Maru Gothic glyph completion</strong>.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading"><strong>📌 Key features</strong></h3>



<p>✅ <strong>LoRA training</strong> (lower VRAM requirements)<br>✅ <strong>8-bit AdamW optimizer</strong> (bitsandbytes)<br>✅ <strong>float16 mixed precision</strong> (suitable for a T4)<br>✅ <strong>Canny edge detection</strong> (via <code>controlnet_aux</code>)<br>✅ <strong>Images converted to 4-channel format</strong> (RGB + Canny edge map)<br>✅ <strong>Reads <code>.txt</code> annotations as captions</strong><br>✅ <strong>Dataset path: <code>/root/AI/datasets/zenmaru_dataset/instance_images</code></strong></p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading">Install the required packages</h3>



<pre class="wp-block-code"><code>!pip install diffusers controlnet_aux bitsandbytes transformers mediapipe accelerate datasets</code></pre>



<h3 class="wp-block-heading"><strong>🚀 Training script 1</strong></h3>



<pre class="wp-block-code"><code>import os
import torch
import bitsandbytes as bnb
import cv2
import numpy as np
from PIL import Image
from diffusers import StableDiffusionPipeline, ControlNetModel, UNet2DConditionModel
from transformers import CLIPTokenizer
from controlnet_aux import CannyDetector
from datasets import load_dataset

# Settings
dataset_dir = "/root/AI/datasets/zenmaru_dataset/instance_images"
output_dir = "./lora_output"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load Stable Diffusion 1.5 + ControlNet
base_model = "runwayml/stable-diffusion-v1-5"
controlnet_model = "lllyasviel/control_v11p_sd15_canny"

pipe = StableDiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.float16).to(device)
controlnet = ControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.float16).to(device)
unet = UNet2DConditionModel.from_pretrained(base_model, subfolder="unet", torch_dtype=torch.float16).to(device)

# Optimizer
optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=1e-4)

# Canny edge detector
canny = CannyDetector()

def process_image(image_path):
    image = Image.open(image_path).convert("RGBA")
    image = np.array(image)
    canny_image = canny(image)
    canny_image = Image.fromarray(canny_image).convert("RGBA")
    return canny_image

# Read the dataset
image_paths = [os.path.join(dataset_dir, f) for f in os.listdir(dataset_dir) if f.endswith(".png") or f.endswith(".jpg")]
captions = {}
for img_path in image_paths:
    txt_path = img_path.replace(".png", ".txt").replace(".jpg", ".txt")
    if os.path.exists(txt_path):
        with open(txt_path, "r", encoding="utf-8") as f:
            captions[img_path] = f.read().strip()

from transformers import CLIPTokenizerFast
from transformers import CLIPTextModel
from diffusers import AutoencoderKL
import torch.nn.functional as F

# Load the VAE
vae = AutoencoderKL.from_pretrained(base_model, subfolder="vae").to(device, dtype=torch.float16)

# Load the tokenizer and text encoder once, outside the training loop
tokenizer = CLIPTokenizerFast.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14").to(device, dtype=torch.float16)

# Training loop
epochs = 5
for epoch in range(epochs):
    for img_path in image_paths:
        caption = captions.get(img_path, "")
        input_image = process_image(img_path)

        input_tensor = pipe.feature_extractor(input_image, return_tensors="pt").pixel_values.to(device, dtype=torch.float16)
        input_tensor = vae.encode(input_tensor).latent_dist.sample()  # latent variables
        input_tensor = input_tensor * 0.18215  # Stable Diffusion's default scaling factor

        caption_tensor = tokenizer(caption, return_tensors="pt").input_ids.to(device)

        optimizer.zero_grad()

        text_embeddings = text_encoder(caption_tensor).last_hidden_state

        # Pick a random timestep
        timesteps = torch.randint(0, 1000, (1,), dtype=torch.long, device=device)

        # Generate the ground-truth noise
        target_noise = torch.randn_like(input_tensor)

        # Add the noise to the original latents
        noisy_input = input_tensor + target_noise

        output = unet(noisy_input, timesteps, encoder_hidden_states=text_embeddings)
        predicted_noise = output.sample

        # Compute the MSE loss
        loss = F.mse_loss(predicted_noise, target_noise)
        loss.backward()
        optimizer.step()

    print(f"Epoch {epoch+1}/{epochs}, Loss: {loss.item()}")

# Save the training results
unet.save_pretrained(output_dir)
controlnet.save_pretrained(output_dir)
print("Training complete. Model saved.")</code></pre>



<p>Quite remarkable that it can produce so many different versions of the script. Training on 512&#215;512 input images, the script above needed 9.6 / 15.0 GB of RAM:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img fetchpriority="high" decoding="async" width="672" height="422" src="https://stackoverflow.max-everyday.com/wp-content/uploads/2025/03/2025-03-03_20-25_71.jpg?v=1741004977" alt="" class="wp-image-6485" srcset="https://stackoverflow.max-everyday.com/wp-content/uploads/2025/03/2025-03-03_20-25_71.jpg?v=1741004977 672w, https://stackoverflow.max-everyday.com/wp-content/uploads/2025/03/2025-03-03_20-25_71-600x377.jpg?v=1741004977 600w, https://stackoverflow.max-everyday.com/wp-content/uploads/2025/03/2025-03-03_20-25_71-200x125.jpg?v=1741004977 200w" sizes="(max-width: 672px) 100vw, 672px" /></figure>
</div>


<p><strong>🚀 Training script 2</strong></p>



<pre class="wp-block-code"><code>import os
import torch
import cv2
import numpy as np
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from transformers import CLIPTokenizer
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, DDPMScheduler
from diffusers.utils import load_image
import bitsandbytes as bnb

# Dataset path
DATASET_DIR = "/root/AI/datasets/zenmaru_dataset/instance_images"

# Read images and convert them to 4-channel format (RGB + Canny)
class FontDataset(Dataset):
    def __init__(self, dataset_dir):
        self.dataset_dir = dataset_dir
        self.image_paths = [os.path.join(dataset_dir, f) for f in os.listdir(dataset_dir) if f.endswith(('.png', '.jpg', '.jpeg'))]
        self.tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image_path = self.image_paths[idx]

        # Load the image
        image = Image.open(image_path).convert("RGB")
        image = np.array(image)

        # Build the Canny edge map
        canny = cv2.Canny(image, 100, 200)
        canny = np.expand_dims(canny, axis=-1)  # add a channel dimension
        image_4ch = np.concatenate([image, canny], axis=-1)  # merge into 4 channels

        # Read the matching caption (if present)
        caption_path = image_path.replace(".png", ".txt").replace(".jpg", ".txt").replace(".jpeg", ".txt")
        if os.path.exists(caption_path):
            with open(caption_path, "r", encoding="utf-8") as f:
                caption = f.read().strip()
        else:
            caption = "Font style training image"

        # Tokenize the caption
        tokens = self.tokenizer(caption, padding="max_length", max_length=77, truncation=True, return_tensors="pt").input_ids

        # Convert to a tensor
        image_4ch = torch.tensor(image_4ch, dtype=torch.float32).permute(2, 0, 1) / 255.0
        return image_4ch, tokens.squeeze()

# Create the Dataset &amp; DataLoader
dataset = FontDataset(DATASET_DIR)
dataloader = DataLoader(dataset, batch_size=1, shuffle=True)

# Load the Stable Diffusion 1.5 model
base_model = "runwayml/stable-diffusion-v1-5"
controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.bfloat16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    base_model,
    controlnet=controlnet,
    torch_dtype=torch.bfloat16
)

# Enable xFormers memory-efficient attention (requires the xformers package)
pipe.enable_xformers_memory_efficient_attention()

# Training setup
optimizer = bnb.optim.AdamW8bit(pipe.unet.parameters(), lr=1e-4)
scheduler = DDPMScheduler.from_pretrained(base_model, subfolder="scheduler")

# LoRA training
from diffusers.training_utils import EMAModel

ema_model = EMAModel(pipe.unet.parameters())

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe.to(device)
ema_model.to(device)

# Training loop
epochs = 5
for epoch in range(epochs):
    for batch, (images, captions) in enumerate(dataloader):
        images = images.to(device, dtype=torch.bfloat16)
        captions = captions.to(device)

        optimizer.zero_grad()

        noise = torch.randn_like(images)
        timesteps = torch.randint(0, scheduler.config.num_train_timesteps, (images.shape[0],), device=device).long()
        noisy_images = scheduler.add_noise(images, noise, timesteps)

        # The U-Net expects text embeddings (not raw token ids) and predicts
        # the added noise; the loss is the MSE against that noise
        text_embeddings = pipe.text_encoder(captions).last_hidden_state
        noise_pred = pipe.unet(noisy_images, timesteps, encoder_hidden_states=text_embeddings).sample
        loss = torch.nn.functional.mse_loss(noise_pred, noise)
        loss.backward()
        optimizer.step()

        print(f"Epoch {epoch+1}/{epochs}, Batch {batch+1}/{len(dataloader)}, Loss: {loss.item()}")

    ema_model.step(pipe.unet.parameters())

# Save the weights
pipe.save_pretrained("/root/AI/output/zenmaru_lora")
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading"><strong>📌 Configuration notes</strong></h3>



<ol class="wp-block-list">
<li><strong>LoRA training</strong> (lower memory requirements).</li>



<li><strong>bitsandbytes 8-bit AdamW</strong> (lower VRAM usage).</li>



<li><strong>float16 training</strong> (fits Colab's free T4 GPU).</li>



<li><strong>Canny edge detection via <code>controlnet_aux</code></strong>.</li>



<li><strong>Reads <code>.txt</code> annotations as captions</strong>.</li>



<li><strong>Saves the LoRA weights to <code>/root/AI/output/zenmaru_lora</code></strong>.</li>
</ol>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading"><strong>🔥 Conclusion</strong></h3>



<p>✅ <strong>This script is suited to Google Colab</strong> and won't crash from running out of memory<br>✅ <strong>LoRA training keeps VRAM requirements low</strong>, so it can run on a T4<br>✅ <strong>8-bit AdamW</strong> for the best balance of performance and stability</p>



<p>With this you can train <strong>SD 1.5 + ControlNet</strong> to learn <strong>Noto Sans outlines + Zen Maru Gothic glyph completion</strong>! 🚀</p>
]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2025/03/sd-control-net-nato-sans-zen-maru-gothic/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Colab does not free GPU memory automatically, especially after an OutOfMemoryError</title>
		<link>https://stackoverflow.max-everyday.com/2025/02/colab-gpu-out-of-memory-error/</link>
					<comments>https://stackoverflow.max-everyday.com/2025/02/colab-gpu-out-of-memory-error/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Fri, 28 Feb 2025 09:46:59 +0000</pubDate>
				<category><![CDATA[機器學習筆記]]></category>
		<category><![CDATA[colab]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=6456</guid>

					<description><![CDATA[In Google Colab, GPU memory...]]></description>
										<content:encoded><![CDATA[
<p>In Google Colab, GPU memory is not freed automatically, especially after an <code>OutOfMemoryError</code>, which can leave fragmented or unreleasable memory behind. There are several ways to reset GPU memory manually:</p>



<h3 class="wp-block-heading">1. Restart the Colab runtime</h3>



<p>This is the simplest and most direct approach, clearing every allocation:</p>



<ul class="wp-block-list">
<li>Click <strong>&#8220;Runtime&#8221;</strong> in the Colab menu.</li>



<li>Choose <strong>&#8220;Restart runtime&#8221;</strong>.</li>
</ul>



<p>This terminates every process and releases all GPU memory.</p>



<h3 class="wp-block-heading">2. Release GPU memory manually</h3>



<p>If you don't want to restart the whole runtime, you can free some GPU memory by hand. Try the following:</p>



<h4 class="wp-block-heading">Use <code>torch.cuda.empty_cache()</code></h4>



<p>This call releases unused cached memory back to the GPU. It does not fully reset GPU memory, but it is effective for reclaiming some of it.</p>



<pre class="wp-block-code"><code>import torch
torch.cuda.empty_cache()
</code></pre>



<h4 class="wp-block-heading">Delete variables you no longer need</h4>



<p>If you have created tensors you no longer need, delete them explicitly and then clear the cache:</p>



<pre class="wp-block-code"><code>del variable_name  # delete the variable you no longer need
torch.cuda.empty_cache()  # release the cached memory it held
</code></pre>



<h4 class="wp-block-heading">Use <code>gc.collect()</code></h4>



<p>Python's garbage collector sometimes fails to reclaim unused memory promptly; triggering it manually can help release memory.</p>



<pre class="wp-block-code"><code>import gc
gc.collect()
torch.cuda.empty_cache()
</code></pre>
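<p>As a stdlib-only illustration (no GPU required) of why dropping the last reference matters before garbage collection can help: the names <code>BigBuffer</code>, <code>buf</code> and <code>alias</code> below are hypothetical stand-ins, not part of any real API.</p>

```python
import gc
import weakref


class BigBuffer:
    """Stand-in for a large tensor held in memory."""
    def __init__(self, n):
        self.data = bytearray(n)


buf = BigBuffer(10_000_000)
probe = weakref.ref(buf)    # lets us observe whether the object is still alive

alias = buf                 # a second reference keeps the buffer alive
del buf
gc.collect()
print(probe() is not None)  # True: `alias` still pins the memory

del alias                   # drop the last reference
gc.collect()                # also sweeps up reference cycles
print(probe() is None)      # True: the memory can now be reused
```

<p>The same applies to GPU tensors: <code>torch.cuda.empty_cache()</code> can only return memory whose tensors no longer have any live Python reference.</p>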



<h3 class="wp-block-heading">3. Set the <code>PYTORCH_CUDA_ALLOC_CONF</code> environment variable</h3>



<p>If you run into memory-allocation problems with PyTorch, you can tune the CUDA allocator's behavior through an environment variable. This can sometimes help avoid GPU memory fragmentation.</p>



<pre class="wp-block-code"><code>import os

# Cap the allocator's split-block size to reduce fragmentation
os.environ&#91;'PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'
</code></pre>



<p>This limits how large the memory blocks are that PyTorch's allocator will split, reducing the chance of fragmentation. Note that the variable must be set before the first CUDA allocation for it to take effect.</p>



<h3 class="wp-block-heading">4. Reduce GPU memory usage</h3>



<p>If GPU memory runs out during training, you can also adjust the following to reduce memory usage:</p>



<ul class="wp-block-list">
<li><strong>Reduce the batch size</strong>: process less data per step.</li>



<li><strong>Use a smaller model</strong>: try a smaller architecture or fewer parameters.</li>



<li><strong>Mixed-precision training</strong>: train in <code>float16</code> to save a large amount of memory.</li>
</ul>



<pre class="wp-block-code"><code># Example of using mixed precision
from torch.cuda.amp import autocast

with autocast():
    output = model(input)
</code></pre>



<p>With these steps you can free up some GPU memory and avoid another <code>OutOfMemoryError</code>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2025/02/colab-gpu-out-of-memory-error/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Steps to resume training on colab</title>
		<link>https://stackoverflow.max-everyday.com/2025/02/colab-sd-resume-training/</link>
					<comments>https://stackoverflow.max-everyday.com/2025/02/colab-sd-resume-training/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Fri, 28 Feb 2025 08:00:07 +0000</pubDate>
				<category><![CDATA[機器學習筆記]]></category>
		<category><![CDATA[colab]]></category>
		<category><![CDATA[stable-diffusion]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=6446</guid>

					<description><![CDATA[If you want to resume training...]]></description>
										<content:encoded><![CDATA[
<p>If you want to <strong>resume training</strong>, make sure of the following:</p>



<ol class="wp-block-list">
<li><strong>The weights from the previous run were saved</strong> (usually under <code>OUTPUT_DIR</code>).</li>



<li><strong>Load the latest checkpoint</strong> (the last step of the previous run).</li>



<li><strong>Continue the Trainer run instead of starting from scratch</strong>.</li>
</ol>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading"><strong><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f539.png" alt="🔹" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Full steps to resume training</strong></h3>



<h4 class="wp-block-heading"><strong>1&#x20e3; 掛載 Google Drive 並確認已訓練的模型</strong></h4>



<pre class="wp-block-code"><code>from google.colab import drive
import os

drive.mount('/content/drive')

# Dataset and output directories on Google Drive
OUTPUT_DIR = "/content/drive/MyDrive/AI/output/zenmaru_finetune"

# Check whether the checkpoint from the previous run exists
checkpoint_path = os.path.join(OUTPUT_DIR, "checkpoint-last")

if os.path.exists(checkpoint_path):
    print(f"&#x2705; Found the previous checkpoint: {checkpoint_path}")
else:
    print("&#x274c; No checkpoint found. Make sure the previous run saved one!")
</code></pre>
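<p>Note that the Hugging Face <code>Trainer</code> normally writes folders named <code>checkpoint-&lt;step&gt;</code> rather than <code>checkpoint-last</code>. A stdlib-only sketch for picking the newest one (the helper name <code>latest_checkpoint</code> is my own, not a library function):</p>

```python
import os
import re


def latest_checkpoint(output_dir):
    """Return the checkpoint-<step> subdirectory with the highest
    step number, or None when no checkpoint exists yet."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    best_step, best_path = -1, None
    for name in os.listdir(output_dir):
        m = pattern.match(name)
        full = os.path.join(output_dir, name)
        if m and os.path.isdir(full) and int(m.group(1)) > best_step:
            best_step, best_path = int(m.group(1)), full
    return best_path
```

<p>The returned path can then be passed to <code>from_pretrained()</code> or <code>trainer.train(resume_from_checkpoint=...)</code> instead of hard-coding <code>checkpoint-last</code>.</p>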



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h4 class="wp-block-heading"><strong>2&#x20e3; 重新載入 Stable Diffusion 2.1 模型</strong></h4>



<pre class="wp-block-code"><code>import torch
from diffusers import StableDiffusionPipeline, UNet2DConditionModel

PRETRAINED_MODEL_NAME = "stabilityai/stable-diffusion-2-1"

# Load the previous checkpoint instead of starting from scratch
unet = UNet2DConditionModel.from_pretrained(
    checkpoint_path,  # the checkpoint saved by the previous run
    torch_dtype=torch.float16
).to("cuda")
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h4 class="wp-block-heading"><strong>3&#x20e3; 設定接續訓練的參數</strong></h4>



<pre class="wp-block-code"><code>from transformers import TrainingArguments, Trainer
from datasets import load_dataset

# Training arguments
training_args = TrainingArguments(
    output_dir=OUTPUT_DIR,  # keep writing to the same directory
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=5e-6,
    max_steps=20000,  # new total step count (includes the previous run's steps)
    save_strategy="steps",
    save_steps=1000,
    resume_from_checkpoint=True  # &#x2705; enable resuming
)

# Reload the dataset
DATASET_DIR = "/content/drive/MyDrive/AI/datasets/zenmaru_dataset"
dataset = load_dataset("imagefolder", data_dir=DATASET_DIR, split="train")

# Set up the Trainer
trainer = Trainer(
    model=unet,
    args=training_args,
    train_dataset=dataset
)
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h4 class="wp-block-heading"><strong>4&#x20e3; 繼續訓練</strong></h4>



<pre class="wp-block-code"><code>trainer.train(resume_from_checkpoint=True)  # &#x2705; this line makes training resume from the checkpoint
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading"><strong><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f539.png" alt="🔹" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Make sure the training results aren't lost</strong></h3>



<p>It is a good idea to check manually on Google Drive that <code>/AI/output/zenmaru_finetune</code> contains:</p>



<ul class="wp-block-list">
<li>A <code>checkpoint-last/</code> directory (containing <code>.bin</code> or <code>.json</code> files).</li>



<li><code>pytorch_model.bin</code> or a <code>.safetensors</code> file (the trained model).</li>
</ul>



<p>If you want to <strong>save the latest model manually</strong>:</p>



<pre class="wp-block-code"><code>trainer.save_model(OUTPUT_DIR)
</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading"><strong><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f539.png" alt="🔹" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Summary</strong></h2>



<ul class="wp-block-list">
<li><strong>Load the previous checkpoint</strong> (make sure <code>checkpoint-last</code> exists).</li>



<li><strong>Set <code>resume_from_checkpoint=True</code></strong> so training picks up where it stopped.</li>



<li><strong>Start training with <code>trainer.train(resume_from_checkpoint=True)</code></strong>.</li>
</ul>



<p>That's all it takes to <strong>continue fine-tuning</strong> on Colab! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /><br>How many steps did your last run reach? <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f603.png" alt="😃" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2025/02/colab-sd-resume-training/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>How to download stabilityai/stable-diffusion-2-1 on colab</title>
		<link>https://stackoverflow.max-everyday.com/2025/02/colab-stabilityai-stable-diffusion-2-1/</link>
					<comments>https://stackoverflow.max-everyday.com/2025/02/colab-stabilityai-stable-diffusion-2-1/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Fri, 28 Feb 2025 07:28:29 +0000</pubDate>
				<category><![CDATA[機器學習筆記]]></category>
		<category><![CDATA[colab]]></category>
		<category><![CDATA[stable-diffusion]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=6440</guid>

					<description><![CDATA[To download stabi... on Google Colab]]></description>
										<content:encoded><![CDATA[
<p>To download the <code>stabilityai/stable-diffusion-2-1</code> pretrained model on Google Colab, use the <code>DiffusionPipeline.from_pretrained()</code> method from the <code>diffusers</code> library. Detailed steps and sample code follow:</p>



<p><strong>1. Install the required libraries</strong></p>



<p>First, make sure <code>diffusers</code>, <code>transformers</code>, and <code>torch</code> are installed. If not, install them with:</p>






<pre class="wp-block-code"><code>!pip install diffusers transformers torch
</code></pre>



<p><strong>2. Download the pretrained model</strong></p>



<p>Use <code>DiffusionPipeline.from_pretrained()</code> to download the model. It fetches the weights from the Hugging Face Model Hub automatically and stores them in Colab's cache directory.</p>






<pre class="wp-block-code"><code>from diffusers import DiffusionPipeline

PRETRAINED_MODEL_NAME = "stabilityai/stable-diffusion-2-1"

# Download the model
pipeline = DiffusionPipeline.from_pretrained(PRETRAINED_MODEL_NAME)

# Move the model to the GPU (if one is available)
pipeline = pipeline.to("cuda")
</code></pre>



<p><strong>3. Run inference with the model (optional)</strong></p>



<p>Once the model is downloaded, you can use it to generate images right away. A simple example:</p>






<pre class="wp-block-code"><code>prompt = "a futuristic cityscape, digital art"
image = pipeline(prompt).images&#91;0]

# Show the generated image
image.show()

# Or save it to a file
image.save("generated_image.png")
</code></pre>



<p><strong>Complete example:</strong></p>






<pre class="wp-block-code"><code>!pip install diffusers transformers torch

from diffusers import DiffusionPipeline

PRETRAINED_MODEL_NAME = "stabilityai/stable-diffusion-2-1"

# Download the model
pipeline = DiffusionPipeline.from_pretrained(PRETRAINED_MODEL_NAME)

# Move the model to the GPU (if one is available)
pipeline = pipeline.to("cuda")

prompt = "a futuristic cityscape, digital art"
image = pipeline(prompt).images&#91;0]

# Show the generated image
image.show()

# Or save it to a file
image.save("generated_image.png")
</code></pre>



<p><strong>Notes:</strong></p>



<ul class="wp-block-list">
<li><strong>GPU acceleration:</strong>
<ul class="wp-block-list">
<li>Use a GPU for faster inference. Make sure your Colab runtime has a GPU selected.</li>



<li><code>pipeline.to("cuda")</code> moves the model to the GPU. If your Colab session has no GPU, remove this line.</li>
</ul>
</li>



<li><strong>Model cache:</strong>
<ul class="wp-block-list">
<li>The downloaded model is stored in Colab's cache directory, so later runs won't download it again if it is already there.</li>
</ul>
</li>



<li><strong>Hugging Face token:</strong>
<ul class="wp-block-list">
<li>Some models require a Hugging Face token to download. If you hit a permission error, make sure you are logged in to Hugging Face and have set your token in Colab.</li>
</ul>
</li>



<li><strong>Memory:</strong>
<ul class="wp-block-list">
<li>Stable Diffusion models are large and may need a lot of GPU memory. If you run out, try a smaller model or generate smaller images.</li>
</ul>
</li>
</ul>
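<p>One common way to supply the token is through the <code>HF_TOKEN</code> environment variable, which <code>huggingface_hub</code> reads automatically; the token value below is a hypothetical placeholder, not a real credential:</p>

```python
import os

# Hypothetical placeholder -- paste a real token created at
# https://huggingface.co/settings/tokens
os.environ["HF_TOKEN"] = "hf_xxxxxxxxxxxxxxxxxxxx"

# Libraries built on huggingface_hub (diffusers, transformers)
# pick the token up from the environment on the next download.
```

<p>Set it before calling <code>from_pretrained()</code> so the first download already carries the credential.</p>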



<p>Hopefully this information helps you download the <code>stabilityai/stable-diffusion-2-1</code> model on Colab successfully.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2025/02/colab-stabilityai-stable-diffusion-2-1/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Emptying the google drive trash</title>
		<link>https://stackoverflow.max-everyday.com/2025/02/empty-google-driver-trash/</link>
					<comments>https://stackoverflow.max-everyday.com/2025/02/empty-google-driver-trash/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Tue, 11 Feb 2025 14:27:52 +0000</pubDate>
				<category><![CDATA[Python筆記]]></category>
		<category><![CDATA[colab]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=6181</guid>

					<description><![CDATA[When training a model on colab, if you don't periodically save the results...]]></description>
										<content:encoded><![CDATA[
<p>When training a model on Colab, if you don't write the results back to Google Drive periodically, everything is lost when your compute time runs out, so I save the data back roughly every 10 minutes.</p>



<p>Google Drive quota is also small: storing too many checkpoints fills the drive, new results can no longer be saved, and data is lost. I don't mind deleting the intermediate results; I only want to keep the last few.</p>
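<p>The "keep only the last few" idea can be sketched with the stdlib alone; the <code>checkpoint-&lt;step&gt;</code> naming and the helper name <code>prune_checkpoints</code> are my own assumptions, so adjust them to your actual file layout:</p>

```python
import os
import re
import shutil


def prune_checkpoints(output_dir, keep=3):
    """Delete every checkpoint-<step> directory except the `keep` newest."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    found = []
    for name in os.listdir(output_dir):
        m = pattern.match(name)
        if m and os.path.isdir(os.path.join(output_dir, name)):
            found.append((int(m.group(1)), name))
    found.sort()  # oldest first
    for _, name in found[:-keep]:  # everything except the `keep` newest
        shutil.rmtree(os.path.join(output_dir, name))
```

<p>Running this right after each save (with <code>keep</code> of at least 1) stops intermediate checkpoints from filling the drive in the first place, which also means less to purge from the trash later.</p>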



<p>Files deleted from Google Drive through a shell command or Python script are always moved to the Trash folder, so the trash has to be emptied periodically. Solution:</p>



<pre class="wp-block-code"><code>from googleapiclient.discovery import build
from google.colab import auth

# 1. Authenticate
auth.authenticate_user()

# 2. Build the Google Drive API service
drive_service = build('drive', 'v3')

try:
    # Empty the Google Drive trash
    response = drive_service.files().emptyTrash().execute()
    #print("Google Drive trash has been emptied.")
except Exception as e:
    print(f"Error: {e}")</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p>If the code above is written into a <code>.py</code> file, no authorization window pops up when it runs, and without successful authorization the trash cannot be emptied.</p>



<p>Steps 1 and 2 therefore need to run in a Colab notebook cell, as shown below:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" width="607" height="302" src="https://stackoverflow.max-everyday.com/wp-content/uploads/2025/02/2025-02-11_22-17_63.jpg?v=1739283943" alt="" class="wp-image-6183" srcset="https://stackoverflow.max-everyday.com/wp-content/uploads/2025/02/2025-02-11_22-17_63.jpg?v=1739283943 607w, https://stackoverflow.max-everyday.com/wp-content/uploads/2025/02/2025-02-11_22-17_63-600x299.jpg?v=1739283943 600w" sizes="(max-width: 607px) 100vw, 607px" /></figure>
</div>


<p>Once authorized, later <code>!python</code> runs of the script can empty the trash correctly.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2025/02/empty-google-driver-trash/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Saving and Loading Your Model to Resume Training in PyTorch</title>
		<link>https://stackoverflow.max-everyday.com/2021/06/saving-and-loading-your-model-to-resume-training-in-pytorch/</link>
					<comments>https://stackoverflow.max-everyday.com/2021/06/saving-and-loading-your-model-to-resume-training-in-pytorch/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Sun, 06 Jun 2021 14:47:03 +0000</pubDate>
				<category><![CDATA[機器學習筆記]]></category>
		<category><![CDATA[colab]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=3779</guid>

					<description><![CDATA[One key to using colab well is to seamlessly resume...]]></description>
										<content:encoded><![CDATA[
<p>One key to using Colab well is resuming training seamlessly. Here is how to resume training in PyTorch.</p>



<h2 class="wp-block-heading">Saving a Checkpoint</h2>



<pre class="wp-block-preformatted">import shutil
import torch
from pathlib import Path

def save_ckp(state, is_best, checkpoint_dir, best_model_dir):
    # checkpoint_dir and best_model_dir are pathlib.Path objects
    f_path = checkpoint_dir / 'checkpoint.pt'
    torch.save(state, f_path)
    if is_best:
        best_fpath = best_model_dir / 'best_model.pt'
        shutil.copyfile(f_path, best_fpath)

checkpoint = {
    'epoch': epoch + 1,
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict()
}
save_ckp(checkpoint, is_best, checkpoint_dir, model_dir)</pre>



<p>Note: some people use <code>.pth</code> as the file extension instead.</p>



<hr class="wp-block-separator"/>



<h2 class="wp-block-heading">Loading a checkpoint</h2>



<pre class="wp-block-preformatted">def load_ckp(checkpoint_fpath, model, optimizer):
    checkpoint = torch.load(checkpoint_fpath)
    model.load_state_dict(checkpoint['state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer'])
    return model, optimizer, checkpoint['epoch']

model = MyModel(*args, **kwargs)
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
ckp_path = "path/to/checkpoint/checkpoint.pt"
model, optimizer, start_epoch = load_ckp(ckp_path, model, optimizer)</pre>
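<p>Stripped of PyTorch, the resume pattern is just "persist a dict, read the epoch back". A stdlib sketch of the same shape (the file name and dict keys simply mirror the snippet above; <code>pickle</code> stands in for <code>torch.save</code>/<code>torch.load</code>, which use it internally):</p>

```python
import os
import pickle
import tempfile


def save_ckp(state, f_path):
    # torch.save is a serializer like this, plus tensor handling
    with open(f_path, "wb") as f:
        pickle.dump(state, f)


def load_ckp(f_path):
    with open(f_path, "rb") as f:
        return pickle.load(f)


path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
save_ckp({"epoch": 5, "state_dict": {"w": [1.0, 2.0]}}, path)
state = load_ckp(path)
print(state["epoch"] + 1)  # 6: the epoch to resume from
```

<p>The real versions only differ in that the values stored under the keys are tensors and optimizer state rather than plain lists.</p>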



<hr class="wp-block-separator"/>



<p>Source:<br><a href="https://medium.com/analytics-vidhya/saving-and-loading-your-model-to-resume-training-in-pytorch-cb687352fa61">https://medium.com/analytics-vidhya/saving-and-loading-your-model-to-resume-training-in-pytorch-cb687352fa61</a></p>
]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2021/06/saving-and-loading-your-model-to-resume-training-in-pytorch/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Pytorch] Switching between GPU and CPU</title>
		<link>https://stackoverflow.max-everyday.com/2021/06/pytorch-gpu-cpu-device-switch/</link>
					<comments>https://stackoverflow.max-everyday.com/2021/06/pytorch-gpu-cpu-device-switch/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Sat, 05 Jun 2021 10:01:03 +0000</pubDate>
				<category><![CDATA[機器學習筆記]]></category>
		<category><![CDATA[colab]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=3774</guid>

					<description><![CDATA[After running on google colab for a few hours...]]></description>
										<content:encoded><![CDATA[
<p>After running on Google Colab for a few hours, this dialog popped up, saying the GPU could no longer be used.</p>



<figure class="wp-block-image size-large"><img decoding="async" width="370" height="201" src="https://stackoverflow.max-everyday.com/wp-content/uploads/2021/06/Screen-Shot-2021-06-05-at-4.20.04-PM.png" alt="" class="wp-image-3776"/></figure>



<p>Google&#8217;s &#8220;Learn more&#8221; page:<br><a href="https://research.google.com/colaboratory/faq.html#usage-limits">https://research.google.com/colaboratory/faq.html#usage-limits</a></p>



<p><strong>What are the usage limits of Colab?</strong></p>



<p>Colab is able to provide free resources in part by having dynamic usage limits that sometimes fluctuate, and by not providing guaranteed or unlimited resources. This means that overall usage limits as well as idle timeout periods, maximum VM lifetime, GPU types available, and other factors vary over time. Colab does not publish these limits, in part because they can (and sometimes do) vary quickly.</p>



<p>GPUs and TPUs are sometimes prioritized for users who use Colab interactively rather than for long-running computations, or for users who have recently used less resources in Colab. As a result, users who use Colab for long-running computations, or users who have recently used more resources in Colab, are more likely to run into usage limits and have their access to GPUs and TPUs temporarily restricted. Users with high computational needs may be interested in using Colab’s UI with a&nbsp;<a href="https://research.google.com/colaboratory/local-runtimes.html">local runtime</a>&nbsp;running on their own hardware. Users interested in having higher and more stable usage limits may be interested in&nbsp;<a href="http://colab.research.google.com/signup?utm_source=faq&amp;utm_medium=link&amp;utm_campaign=what_are_usage_limits">Colab Pro</a>.</p>



<hr class="wp-block-separator"/>



<p>I had no choice but to switch the program to run on the CPU. Without adjusting the code, running it again directly shows this error:</p>



<pre class="wp-block-preformatted">File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/serialization.py", line 135, in validate_cuda_device
     raise RuntimeError('Attempting to deserialize object on a CUDA '
 RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.</pre>



<hr class="wp-block-separator"/>



<p>The switch is quite simple: find the <code>torch.load</code> call and add this parameter:</p>



<pre id="block-aa2e1c74-4fcc-4e77-9d6d-9fcdb20427c0" class="wp-block-preformatted">checkpoint = torch.load(ckp_path, map_location=torch.device('cpu'))</pre>



<p>That solved the problem. The relevant documentation:<br><a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html">https://pytorch.org/tutorials/beginner/saving_loading_models.html</a></p>



<p>When it comes to saving and loading models, there are three core functions to be familiar with:</p>



<ol class="wp-block-list"><li><a href="https://pytorch.org/docs/stable/torch.html?highlight=save#torch.save">torch.save</a>: Saves a serialized object to disk. This function uses Python’s&nbsp;<a href="https://docs.python.org/3/library/pickle.html">pickle</a>&nbsp;utility for serialization. Models, tensors, and dictionaries of all kinds of objects can be saved using this function.</li><li><a href="https://pytorch.org/docs/stable/torch.html?highlight=torch%20load#torch.load">torch.load</a>: Uses&nbsp;<a href="https://docs.python.org/3/library/pickle.html">pickle</a>’s unpickling facilities to deserialize pickled object files to memory. This function also facilitates the device to load the data into (see&nbsp;<a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-across-devices">Saving &amp; Loading Model Across Devices</a>).</li><li><a href="https://pytorch.org/docs/stable/nn.html?highlight=load_state_dict#torch.nn.Module.load_state_dict">torch.nn.Module.load_state_dict</a>: Loads a model’s parameter dictionary using a deserialized&nbsp;<em>state_dict</em>. For more information on&nbsp;<em>state_dict</em>, see&nbsp;<a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict">What is a state_dict?</a>.</li></ol>



<p><strong>Contents:</strong></p>



<ul class="wp-block-list"><li><a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict">What is a state_dict?</a></li><li><a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-for-inference">Saving &amp; Loading Model for Inference</a></li><li><a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-a-general-checkpoint-for-inference-and-or-resuming-training">Saving &amp; Loading a General Checkpoint</a></li><li><a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-multiple-models-in-one-file">Saving Multiple Models in One File</a></li><li><a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html#warmstarting-model-using-parameters-from-a-different-model">Warmstarting Model Using Parameters from a Different Model</a></li><li><a href="https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-across-devices">Saving &amp; Loading Model Across Devices</a></li></ul>



<h2 class="wp-block-heading">What is a&nbsp;<code>state_dict</code>?</h2>



<p>In PyTorch, the learnable parameters (i.e.&nbsp;weights and biases) of a&nbsp;<code>torch.nn.Module</code>&nbsp;model are contained in the model’s&nbsp;<em>parameters</em>&nbsp;(accessed with&nbsp;<code>model.parameters()</code>). A&nbsp;<em>state_dict</em>&nbsp;is simply a Python dictionary object that maps each layer to its parameter tensor. Note that only layers with learnable parameters (convolutional layers, linear layers, etc.) and registered buffers (batchnorm’s running_mean) have entries in the model’s&nbsp;<em>state_dict</em>. Optimizer objects (<code>torch.optim</code>) also have a&nbsp;<em>state_dict</em>, which contains information about the optimizer’s state, as well as the hyperparameters used.</p>



<p>Because&nbsp;<em>state_dict</em>&nbsp;objects are Python dictionaries, they can be easily saved, updated, altered, and restored, adding a great deal of modularity to PyTorch models and optimizers.</p>



<h3 class="wp-block-heading">Example:</h3>



<p>Let’s take a look at the&nbsp;<em>state_dict</em>&nbsp;from the simple model used in the&nbsp;<a href="https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py">Training a classifier</a>&nbsp;tutorial.</p>



<pre id="codecell0" class="wp-block-preformatted"># Define model
class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Initialize model
model = TheModelClass()

# Initialize optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Print model's state_dict
print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())

# Print optimizer's state_dict
print("Optimizer's state_dict:")
for var_name in optimizer.state_dict():
    print(var_name, "\t", optimizer.state_dict()[var_name])
</pre>






<p><strong>Output:</strong></p>



<pre id="codecell1" class="wp-block-preformatted">Model's state_dict:
conv1.weight     torch.Size([6, 3, 5, 5])
conv1.bias   torch.Size([6])
conv2.weight     torch.Size([16, 6, 5, 5])
conv2.bias   torch.Size([16])
fc1.weight   torch.Size([120, 400])
fc1.bias     torch.Size([120])
fc2.weight   torch.Size([84, 120])
fc2.bias     torch.Size([84])
fc3.weight   torch.Size([10, 84])
fc3.bias     torch.Size([10])

Optimizer's state_dict:
state    {}
param_groups     [{'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [4675713712, 4675713784, 4675714000, 4675714072, 4675714216, 4675714288, 4675714432, 4675714504, 4675714648, 4675714720]}]
</pre>






<h2 class="wp-block-heading">Saving &amp; Loading Model for Inference</h2>



<h3 class="wp-block-heading">Save/Load&nbsp;<code>state_dict</code>&nbsp;(Recommended)</h3>



<p><strong>Save:</strong></p>



<pre id="codecell2" class="wp-block-preformatted">torch.save(model.state_dict(), PATH)
</pre>






<p><strong>Load:</strong></p>



<pre id="codecell3" class="wp-block-preformatted">model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
</pre>






<p>NOTE</p>



<p>The 1.6 release of PyTorch switched&nbsp;<code>torch.save</code>&nbsp;to use a new zipfile-based file format.&nbsp;<code>torch.load</code>&nbsp;still retains the ability to load files in the old format. If for any reason you want&nbsp;<code>torch.save</code>&nbsp;to use the old format, pass the kwarg&nbsp;<code>_use_new_zipfile_serialization=False</code>.</p>



<p>When saving a model for inference, it is only necessary to save the trained model’s learned parameters. Saving the model’s&nbsp;<em>state_dict</em>&nbsp;with the&nbsp;<code>torch.save()</code>&nbsp;function will give you the most flexibility for restoring the model later, which is why it is the recommended method for saving models.</p>



<p>A common PyTorch convention is to save models using either a&nbsp;<code>.pt</code>&nbsp;or&nbsp;<code>.pth</code>&nbsp;file extension.</p>



<p>Remember that you must call&nbsp;<code>model.eval()</code>&nbsp;to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results.</p>
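<p>The effect of <code>model.eval()</code> can be seen with a standalone dropout layer. This is a minimal sketch using a toy <code>nn.Dropout</code> module, not code from the tutorial itself:</p>

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

# In training mode, dropout randomly zeroes elements and rescales the rest
drop.train()
train_out = drop(x)

# In evaluation mode, dropout is a no-op and returns the input unchanged
drop.eval()
eval_out = drop(x)

print((train_out == 0).any().item())  # True: some elements were zeroed
print(torch.equal(eval_out, x))       # True: eval mode is the identity
```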



<p>NOTE</p>



<p>Notice that the&nbsp;<code>load_state_dict()</code>&nbsp;function takes a dictionary object, NOT a path to a saved object. This means that you must deserialize the saved&nbsp;<em>state_dict</em>&nbsp;before you pass it to the&nbsp;<code>load_state_dict()</code>&nbsp;function. For example, you CANNOT load using&nbsp;<code>model.load_state_dict(PATH)</code>.</p>



<p>NOTE</p>



<p>If you only plan to keep the best performing model (according to the validation loss), don’t forget that&nbsp;<code>best_model_state&nbsp;=&nbsp;model.state_dict()</code>&nbsp;returns a reference to the state, not a copy of it! You must either serialize&nbsp;<code>best_model_state</code>&nbsp;immediately or take a snapshot with&nbsp;<code>best_model_state&nbsp;=&nbsp;deepcopy(model.state_dict())</code>; otherwise your&nbsp;<code>best_model_state</code>&nbsp;will keep getting updated by subsequent training iterations, and the final saved state will be that of the overfitted model.</p>
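<p>The reference-vs-copy behaviour can be demonstrated with a tiny model; the <code>nn.Linear</code> layer here is just a stand-in for illustration:</p>

```python
from copy import deepcopy

import torch
import torch.nn as nn

model = nn.Linear(2, 1)

# state_dict() returns tensors that share storage with the live model...
shared_state = model.state_dict()
# ...while deepcopy takes an independent snapshot
copied_state = deepcopy(model.state_dict())

# Simulate a later training step that updates the weights in place
with torch.no_grad():
    model.weight.add_(1.0)

print(torch.equal(shared_state["weight"], model.weight))  # True: followed the update
print(torch.equal(copied_state["weight"], model.weight))  # False: kept the old values
```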



<h3 class="wp-block-heading">Save/Load Entire Model</h3>



<p><strong>Save:</strong></p>



<pre id="codecell4" class="wp-block-preformatted"><strong>torch</strong><strong>.</strong><strong>save</strong><strong>(</strong><strong>model</strong><strong>,</strong> <strong>PATH</strong><strong>)</strong>
</pre>






<p><strong>Load:</strong></p>



<pre id="codecell5" class="wp-block-preformatted"><em># Model class must be defined somewhere</em>
<strong>model</strong> <strong>=</strong> <strong>torch</strong><strong>.</strong><strong>load</strong><strong>(</strong><strong>PATH</strong><strong>)</strong>
<strong>model</strong><strong>.</strong><strong>eval</strong><strong>()</strong>
</pre>






<p>This save/load process uses the most intuitive syntax and involves the least amount of code. Saving a model this way serializes the entire module using Python’s&nbsp;<a href="https://docs.python.org/3/library/pickle.html">pickle</a>&nbsp;module. The disadvantage is that the serialized data is bound to the specific classes and the exact directory structure used when the model was saved, because pickle does not save the model class itself; it saves a path to the file containing the class, which is used at load time. Because of this, your code can break in various ways when used in other projects or after refactors.</p>



<p>A common PyTorch convention is to save models using either a&nbsp;<code>.pt</code>&nbsp;or&nbsp;<code>.pth</code>&nbsp;file extension.</p>



<p>Remember that you must call&nbsp;<code>model.eval()</code>&nbsp;to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results.</p>



<h2 class="wp-block-heading">Saving &amp; Loading a General Checkpoint for Inference and/or Resuming Training</h2>



<h3 class="wp-block-heading">Save:</h3>



<pre id="codecell6" class="wp-block-preformatted"><strong>torch</strong><strong>.</strong><strong>save</strong><strong>({</strong>
            'epoch'<strong>:</strong> <strong>epoch</strong><strong>,</strong>
            'model_state_dict'<strong>:</strong> <strong>model</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong>
            'optimizer_state_dict'<strong>:</strong> <strong>optimizer</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong>
            'loss'<strong>:</strong> <strong>loss</strong><strong>,</strong>
            <strong>...</strong>
            <strong>},</strong> <strong>PATH</strong><strong>)</strong>
</pre>






<h3 class="wp-block-heading">Load:</h3>



<pre id="codecell7" class="wp-block-preformatted"><strong>model</strong> <strong>=</strong> <strong>TheModelClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>
<strong>optimizer</strong> <strong>=</strong> <strong>TheOptimizerClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>

<strong>checkpoint</strong> <strong>=</strong> <strong>torch</strong><strong>.</strong><strong>load</strong><strong>(</strong><strong>PATH</strong><strong>)</strong>
<strong>model</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>checkpoint</strong><strong>[</strong>'model_state_dict'<strong>])</strong>
<strong>optimizer</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>checkpoint</strong><strong>[</strong>'optimizer_state_dict'<strong>])</strong>
<strong>epoch</strong> <strong>=</strong> <strong>checkpoint</strong><strong>[</strong>'epoch'<strong>]</strong>
<strong>loss</strong> <strong>=</strong> <strong>checkpoint</strong><strong>[</strong>'loss'<strong>]</strong>

<strong>model</strong><strong>.</strong><strong>eval</strong><strong>()</strong>
<em># - or -</em>
<strong>model</strong><strong>.</strong><strong>train</strong><strong>()</strong>
</pre>






<p>When saving a general checkpoint, to be used for either inference or resuming training, you must save more than just the model’s&nbsp;<em>state_dict</em>. It is important to also save the optimizer’s&nbsp;<em>state_dict</em>, as this contains buffers and parameters that are updated as the model trains. Other items that you may want to save are the epoch you left off on, the latest recorded training loss, external&nbsp;<code>torch.nn.Embedding</code>&nbsp;layers, etc. As a result, such a checkpoint is often 2~3 times larger than the model alone.</p>



<p>To save multiple components, organize them in a dictionary and use&nbsp;<code>torch.save()</code>&nbsp;to serialize the dictionary. A common PyTorch convention is to save these checkpoints using the&nbsp;<code>.tar</code>&nbsp;file extension.</p>



<p>To load the items, first initialize the model and optimizer, then load the dictionary locally using&nbsp;<code>torch.load()</code>. From here, you can easily access the saved items by simply querying the dictionary as you would expect.</p>



<p>Remember that you must call&nbsp;<code>model.eval()</code>&nbsp;to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results. If you wish to resume training, call&nbsp;<code>model.train()</code>&nbsp;to ensure these layers are in training mode.</p>



<h2 class="wp-block-heading">Saving Multiple Models in One File</h2>



<h3 class="wp-block-heading">Save:</h3>



<pre id="codecell8" class="wp-block-preformatted"><strong>torch</strong><strong>.</strong><strong>save</strong><strong>({</strong>
            'modelA_state_dict'<strong>:</strong> <strong>modelA</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong>
            'modelB_state_dict'<strong>:</strong> <strong>modelB</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong>
            'optimizerA_state_dict'<strong>:</strong> <strong>optimizerA</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong>
            'optimizerB_state_dict'<strong>:</strong> <strong>optimizerB</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong>
            <strong>...</strong>
            <strong>},</strong> <strong>PATH</strong><strong>)</strong>
</pre>






<h3 class="wp-block-heading">Load:</h3>



<pre id="codecell9" class="wp-block-preformatted"><strong>modelA</strong> <strong>=</strong> <strong>TheModelAClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>
<strong>modelB</strong> <strong>=</strong> <strong>TheModelBClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>
<strong>optimizerA</strong> <strong>=</strong> <strong>TheOptimizerAClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>
<strong>optimizerB</strong> <strong>=</strong> <strong>TheOptimizerBClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>

<strong>checkpoint</strong> <strong>=</strong> <strong>torch</strong><strong>.</strong><strong>load</strong><strong>(</strong><strong>PATH</strong><strong>)</strong>
<strong>modelA</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>checkpoint</strong><strong>[</strong>'modelA_state_dict'<strong>])</strong>
<strong>modelB</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>checkpoint</strong><strong>[</strong>'modelB_state_dict'<strong>])</strong>
<strong>optimizerA</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>checkpoint</strong><strong>[</strong>'optimizerA_state_dict'<strong>])</strong>
<strong>optimizerB</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>checkpoint</strong><strong>[</strong>'optimizerB_state_dict'<strong>])</strong>

<strong>modelA</strong><strong>.</strong><strong>eval</strong><strong>()</strong>
<strong>modelB</strong><strong>.</strong><strong>eval</strong><strong>()</strong>
<em># - or -</em>
<strong>modelA</strong><strong>.</strong><strong>train</strong><strong>()</strong>
<strong>modelB</strong><strong>.</strong><strong>train</strong><strong>()</strong>
</pre>






<p>When saving a model comprised of multiple&nbsp;<code>torch.nn.Modules</code>, such as a GAN, a sequence-to-sequence model, or an ensemble of models, you follow the same approach as when you are saving a general checkpoint. In other words, save a dictionary of each model’s&nbsp;<em>state_dict</em>&nbsp;and corresponding optimizer. As mentioned before, you can save any other items that may aid you in resuming training by simply appending them to the dictionary.</p>



<p>A common PyTorch convention is to save these checkpoints using the&nbsp;<code>.tar</code>&nbsp;file extension.</p>



<p>To load the models, first initialize the models and optimizers, then load the dictionary locally using&nbsp;<code>torch.load()</code>. From here, you can easily access the saved items by simply querying the dictionary as you would expect.</p>



<p>Remember that you must call&nbsp;<code>model.eval()</code>&nbsp;to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results. If you wish to resume training, call&nbsp;<code>model.train()</code>&nbsp;to set these layers to training mode.</p>



<h2 class="wp-block-heading">Warmstarting Model Using Parameters from a Different Model</h2>



<h3 class="wp-block-heading">Save:</h3>



<pre id="codecell10" class="wp-block-preformatted"><strong>torch</strong><strong>.</strong><strong>save</strong><strong>(</strong><strong>modelA</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong> <strong>PATH</strong><strong>)</strong>
</pre>






<h3 class="wp-block-heading">Load:</h3>



<pre id="codecell11" class="wp-block-preformatted"><strong>modelB</strong> <strong>=</strong> <strong>TheModelBClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>
<strong>modelB</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>torch</strong><strong>.</strong><strong>load</strong><strong>(</strong><strong>PATH</strong><strong>),</strong> <strong>strict</strong><strong>=</strong><strong>False</strong><strong>)</strong>
</pre>






<p>Partially loading a model or loading a partial model are common scenarios when transfer learning or training a new complex model. Leveraging trained parameters, even if only a few are usable, will help to warmstart the training process and hopefully help your model converge much faster than training from scratch.</p>



<p>Whether you are loading from a partial&nbsp;<em>state_dict</em>, which is missing some keys, or loading a&nbsp;<em>state_dict</em>&nbsp;with more keys than the model that you are loading into, you can set the&nbsp;<code>strict</code>&nbsp;argument to&nbsp;<strong>False</strong>&nbsp;in the&nbsp;<code>load_state_dict()</code>&nbsp;function to ignore non-matching keys.</p>



<p>If you want to load parameters from one layer to another, but some keys do not match, simply change the name of the parameter keys in the&nbsp;<em>state_dict</em>&nbsp;that you are loading to match the keys in the model that you are loading into.</p>
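<p>Both points can be sketched with small hypothetical models (the class and attribute names below are made up for illustration):</p>

```python
import torch
import torch.nn as nn

class NetA(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Linear(4, 8)

class NetB(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Linear(4, 8)    # same layer as NetA
        self.classifier = nn.Linear(8, 3)  # extra head that NetA does not have

netA, netB = NetA(), NetB()

# strict=False ignores keys that are absent from the loaded state_dict; the
# new head's parameters are simply reported back as missing_keys
result = netB.load_state_dict(netA.state_dict(), strict=False)
print(result.missing_keys)  # the classifier's weight and bias

class NetC(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(4, 8)    # same layer, different attribute name

# When only the key names differ, rename them before loading
netC = NetC()
renamed = {k.replace("features.", "backbone."): v
           for k, v in netA.state_dict().items()}
netC.load_state_dict(renamed)
```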



<h2 class="wp-block-heading">Saving &amp; Loading Model Across Devices</h2>



<h3 class="wp-block-heading">Save on GPU, Load on CPU</h3>



<p><strong>Save:</strong></p>



<pre id="codecell12" class="wp-block-preformatted"><strong>torch</strong><strong>.</strong><strong>save</strong><strong>(</strong><strong>model</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong> <strong>PATH</strong><strong>)</strong>
</pre>






<p><strong>Load:</strong></p>



<pre id="codecell13" class="wp-block-preformatted"><strong>device</strong> <strong>=</strong> <strong>torch</strong><strong>.</strong><strong>device</strong><strong>(</strong>'cpu'<strong>)</strong>
<strong>model</strong> <strong>=</strong> <strong>TheModelClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>
<strong>model</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>torch</strong><strong>.</strong><strong>load</strong><strong>(</strong><strong>PATH</strong><strong>,</strong> <strong>map_location</strong><strong>=</strong><strong>device</strong><strong>))</strong>
</pre>






<p>When loading a model on a CPU that was trained with a GPU, pass&nbsp;<code>torch.device('cpu')</code>&nbsp;to the&nbsp;<code>map_location</code>&nbsp;argument in the&nbsp;<code>torch.load()</code>&nbsp;function. In this case, the storages underlying the tensors are dynamically remapped to the CPU device using the&nbsp;<code>map_location</code>&nbsp;argument.</p>



<h3 class="wp-block-heading">Save on GPU, Load on GPU</h3>



<p><strong>Save:</strong></p>



<pre id="codecell14" class="wp-block-preformatted"><strong>torch</strong><strong>.</strong><strong>save</strong><strong>(</strong><strong>model</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong> <strong>PATH</strong><strong>)</strong>
</pre>






<p><strong>Load:</strong></p>



<pre id="codecell15" class="wp-block-preformatted"><strong>device</strong> <strong>=</strong> <strong>torch</strong><strong>.</strong><strong>device</strong><strong>(</strong>"cuda"<strong>)</strong>
<strong>model</strong> <strong>=</strong> <strong>TheModelClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>
<strong>model</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>torch</strong><strong>.</strong><strong>load</strong><strong>(</strong><strong>PATH</strong><strong>))</strong>
<strong>model</strong><strong>.</strong><strong>to</strong><strong>(</strong><strong>device</strong><strong>)</strong>
<em># Make sure to call input = input.to(device) on any input tensors that you feed to the model</em>
</pre>






<p>When loading a model on a GPU that was trained and saved on GPU, simply convert the initialized&nbsp;<code>model</code>&nbsp;to a CUDA optimized model using&nbsp;<code>model.to(torch.device('cuda'))</code>. Also, be sure to use the&nbsp;<code>.to(torch.device('cuda'))</code>&nbsp;function on all model inputs to prepare the data for the model. Note that calling&nbsp;<code>my_tensor.to(device)</code>&nbsp;returns a new copy of&nbsp;<code>my_tensor</code>&nbsp;on GPU. It does NOT overwrite&nbsp;<code>my_tensor</code>. Therefore, remember to manually overwrite tensors:&nbsp;<code>my_tensor&nbsp;=&nbsp;my_tensor.to(torch.device('cuda'))</code>.</p>
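<p>That <code>.to()</code> is not an in-place operation can be checked even without a GPU; this sketch uses a dtype change as a stand-in for a device change:</p>

```python
import torch

t = torch.zeros(3)            # float32 tensor
moved = t.to(torch.float64)   # returns a NEW tensor; t itself is untouched
dtype_after_to = t.dtype      # still torch.float32

# The idiomatic pattern, as with my_tensor.to(device), is to rebind the name:
t = t.to(torch.float64)
print(t.dtype)  # torch.float64
```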



<h3 class="wp-block-heading">Save on CPU, Load on GPU</h3>



<p><strong>Save:</strong></p>



<pre id="codecell16" class="wp-block-preformatted"><strong>torch</strong><strong>.</strong><strong>save</strong><strong>(</strong><strong>model</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong> <strong>PATH</strong><strong>)</strong>
</pre>






<p><strong>Load:</strong></p>



<pre id="codecell17" class="wp-block-preformatted"><strong>device</strong> <strong>=</strong> <strong>torch</strong><strong>.</strong><strong>device</strong><strong>(</strong>"cuda"<strong>)</strong>
<strong>model</strong> <strong>=</strong> <strong>TheModelClass</strong><strong>(</strong><strong>*</strong><strong>args</strong><strong>,</strong> <strong>**</strong><strong>kwargs</strong><strong>)</strong>
<strong>model</strong><strong>.</strong><strong>load_state_dict</strong><strong>(</strong><strong>torch</strong><strong>.</strong><strong>load</strong><strong>(</strong><strong>PATH</strong><strong>,</strong> <strong>map_location</strong><strong>=</strong>"cuda:0"<strong>))</strong>  <em># Choose whatever GPU device number you want</em>
<strong>model</strong><strong>.</strong><strong>to</strong><strong>(</strong><strong>device</strong><strong>)</strong>
<em># Make sure to call input = input.to(device) on any input tensors that you feed to the model</em>
</pre>






<p>When loading a model on a GPU that was trained and saved on CPU, set the&nbsp;<code>map_location</code>&nbsp;argument in the&nbsp;<code>torch.load()</code>&nbsp;function to&nbsp;<em>cuda:device_id</em>. This loads the model to a given GPU device. Next, be sure to call&nbsp;<code>model.to(torch.device('cuda'))</code>&nbsp;to convert the model’s parameter tensors to CUDA tensors. Finally, be sure to use the&nbsp;<code>.to(torch.device('cuda'))</code>&nbsp;function on all model inputs to prepare the data for the CUDA optimized model. Note that calling&nbsp;<code>my_tensor.to(device)</code>&nbsp;returns a new copy of&nbsp;<code>my_tensor</code>&nbsp;on GPU. It does NOT overwrite&nbsp;<code>my_tensor</code>. Therefore, remember to manually overwrite tensors:&nbsp;<code>my_tensor&nbsp;=&nbsp;my_tensor.to(torch.device('cuda'))</code>.</p>



<h3 class="wp-block-heading">Saving&nbsp;<code>torch.nn.DataParallel</code>&nbsp;Models</h3>



<p><strong>Save:</strong></p>



<pre id="codecell18" class="wp-block-preformatted"><strong>torch</strong><strong>.</strong><strong>save</strong><strong>(</strong><strong>model</strong><strong>.</strong><strong>module</strong><strong>.</strong><strong>state_dict</strong><strong>(),</strong> <strong>PATH</strong><strong>)</strong>
</pre>






<p><strong>Load:</strong></p>



<pre id="codecell19" class="wp-block-preformatted"><em># Load to whatever device you want</em>
</pre>






<p><code>torch.nn.DataParallel</code>&nbsp;is a model wrapper that enables parallel GPU utilization. To save a&nbsp;<code>DataParallel</code>&nbsp;model generically, save the&nbsp;<code>model.module.state_dict()</code>. This way, you have the flexibility to load the model any way you want to any device you want.</p>
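<p>A minimal sketch of the round trip (the file name is hypothetical, <code>nn.Linear</code> stands in for a real model, and the wrapper is constructed on CPU purely for illustration):</p>

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.DataParallel(nn.Linear(4, 2))  # the wrapped model lives at model.module

# Saving model.module.state_dict() keeps the keys free of the 'module.' prefix
PATH = os.path.join(tempfile.gettempdir(), "dp_model.pt")
torch.save(model.module.state_dict(), PATH)

# The checkpoint then loads into a plain, unwrapped model on any device
plain = nn.Linear(4, 2)
plain.load_state_dict(torch.load(PATH, map_location="cpu"))
```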
]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2021/06/pytorch-gpu-cpu-device-switch/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Testing the Free TPU Provided by Colab</title>
		<link>https://stackoverflow.max-everyday.com/2021/06/colab-tpu-test/</link>
					<comments>https://stackoverflow.max-everyday.com/2021/06/colab-tpu-test/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Fri, 04 Jun 2021 17:58:22 +0000</pubDate>
				<category><![CDATA[機器學習筆記]]></category>
		<category><![CDATA[colab]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=3768</guid>

					<description><![CDATA[Based on information gathered from other pages, the free...]]></description>
										<content:encoded><![CDATA[
<p>Based on information gathered from other pages, the free TPU is well worth using: training may take only about 1/3 of the original time. The drawback is that the code needs some minor adjustments. Reference pages:</p>



<p>Colab provides a free TPU<br><a href="https://www.itread01.com/elpqhk.html">https://www.itread01.com/elpqhk.html</a></p>



<p>A good reference example:<br>Getting Started with PyTorch on Cloud TPUs<br><a href="https://medium.com/pytorch/get-started-with-pytorch-cloud-tpus-and-colab-a24757b8f7fc">https://medium.com/pytorch/get-started-with-pytorch-cloud-tpus-and-colab-a24757b8f7fc</a></p>



<h2 class="wp-block-heading">Get started with our Colab Tutorials</h2>



<ul class="wp-block-list"><li><a href="https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/getting-started.ipynb">Getting Started with PyTorch on Cloud TPUs</a></li><li><a href="https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/single-core-alexnet-fashion-mnist.ipynb">Training AlexNet on Fashion MNIST with a single Cloud TPU Core</a></li><li><a href="https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/multi-core-alexnet-fashion-mnist.ipynb">Training AlexNet on Fashion MNIST with multiple Cloud TPU Cores</a></li><li><a href="https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/style_transfer_inference.ipynb">Fast Neural Style Transfer (NeurIPS 2019 Demo)</a></li><li><a href="https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/mnist-training.ipynb">Training A Simple Convolutional Network on MNIST</a></li><li><a href="https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/resnet18-training.ipynb">Training a ResNet18 Network on CIFAR10</a></li><li><a href="https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/resnet50-inference.ipynb">ImageNet Inference with ResNet50</a></li><li><a href="https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/pytorch-xla-profiling-colab.ipynb">PyTorch/XLA Performance Profiling</a></li></ul>



<p><em>Note</em>: These Colab notebooks typically run on small machines (the Compute VMs, which run the input pipeline), and training is often bottlenecked by these small Compute VMs. For optimal performance, create a GCP VM and TPU pair following the GCP tutorials:</p>



<ul class="wp-block-list"><li><a href="https://cloud.google.com/tpu/docs/tutorials/transformer-pytorch">Training FairSeq Transformer on Cloud TPUs</a></li><li><a href="https://cloud.google.com/tpu/docs/tutorials/resnet-pytorch">Training Resnet50 on Cloud TPUs</a></li><li><a href="https://cloud.google.com/tpu/docs/tutorials/pytorch-pod">Training PyTorch models on Cloud TPU Pods</a></li></ul>



<hr class="wp-block-separator"/>



<p>In my own test, importing xla_model printed messages showing the TPU being initialized, and the TPU resource was acquired successfully in the end.</p>



<p>TPU and CUDA appear to live in separate worlds, so CUDA was not enabled.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="506" src="https://stackoverflow.max-everyday.com/wp-content/uploads/2021/06/Screen-Shot-2021-06-05-at-1.55.37-AM-1024x506.png?v=1622829375" alt="" class="wp-image-3770" srcset="https://stackoverflow.max-everyday.com/wp-content/uploads/2021/06/Screen-Shot-2021-06-05-at-1.55.37-AM-1024x506.png?v=1622829375 1024w, https://stackoverflow.max-everyday.com/wp-content/uploads/2021/06/Screen-Shot-2021-06-05-at-1.55.37-AM-600x297.png?v=1622829375 600w, https://stackoverflow.max-everyday.com/wp-content/uploads/2021/06/Screen-Shot-2021-06-05-at-1.55.37-AM-768x380.png?v=1622829375 768w, https://stackoverflow.max-everyday.com/wp-content/uploads/2021/06/Screen-Shot-2021-06-05-at-1.55.37-AM-1536x759.png?v=1622829375 1536w, https://stackoverflow.max-everyday.com/wp-content/uploads/2021/06/Screen-Shot-2021-06-05-at-1.55.37-AM.png?v=1622829375 1740w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2021/06/colab-tpu-test/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[PyTorch] output with shape [1, 28, 28] doesn&#8217;t match the broadcast shape [3, 28, 28]</title>
		<link>https://stackoverflow.max-everyday.com/2021/06/pytorch-output-with-shape-1-28-28-doesnt-match-the-broadcast-shape-3-28-28/</link>
					<comments>https://stackoverflow.max-everyday.com/2021/06/pytorch-output-with-shape-1-28-28-doesnt-match-the-broadcast-shape-3-28-28/#respond</comments>
		
		<dc:creator><![CDATA[max-stackoverflow]]></dc:creator>
		<pubDate>Thu, 03 Jun 2021 20:42:56 +0000</pubDate>
				<category><![CDATA[Python筆記]]></category>
		<category><![CDATA[機器學習筆記]]></category>
		<category><![CDATA[colab]]></category>
		<category><![CDATA[image]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://stackoverflow.max-everyday.com/?p=3766</guid>

					<description><![CDATA[This error is caused by the image format: color or...]]></description>
										<content:encoded><![CDATA[
<p>This error is caused by the image format: whether the picture is color or grayscale (black and white).</p>



<p>To clarify: if the image has three channels (RGB), <code>mean</code> and <code>std</code> must each contain three numbers, for example mean = [0.5, 0.5, 0.5], and each channel is normalized as (channel - mean) / std. If the image is grayscale with only a single channel, <code>mean</code> and <code>std</code> should each contain one number, such as [0.5].</p>



<p>Source:<br><a href="https://github.com/yunjey/pytorch-tutorial/issues/161">https://github.com/yunjey/pytorch-tutorial/issues/161</a></p>
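<p>The error can be reproduced (and avoided) with plain broadcasting; the shapes below mirror the 28×28 image from the title, and the in-place <code>sub_</code> essentially mimics what <code>transforms.Normalize</code> does internally:</p>

```python
import torch

gray = torch.rand(1, 28, 28)                         # single-channel (grayscale) image
mean3 = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)  # 3-channel mean, as Normalize builds it

# An in-place op cannot broadcast the 1-channel image up to 3 channels
msg = ""
try:
    gray.sub_(mean3)
except RuntimeError as e:
    msg = str(e)
print(msg)

# A single-entry mean matches the single channel and works
mean1 = torch.tensor([0.5]).view(1, 1, 1)
ok = gray.sub(mean1)
print(ok.shape)  # torch.Size([1, 28, 28])
```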



<hr class="wp-block-separator"/>



<p>There are several fixes: convert the image to a color image so it matches what the code expects, or change the program's flow. If you read the image with PIL in Python, you can convert it like this:</p>



<pre class="wp-block-code"><code>from PIL import Image

img = Image.open(GIF_FILENAME)
rgbimg = Image.new("RGB", img.size)  # use "RGB": JPEG cannot store an alpha channel
rgbimg.paste(img)
rgbimg.save('foo.jpg')</code></pre>
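<p>A simpler alternative is PIL's own <code>convert()</code>. This sketch creates a throwaway grayscale image (hypothetical file name) and converts it on load:</p>

```python
import os
import tempfile

from PIL import Image

path = os.path.join(tempfile.gettempdir(), "gray.png")
Image.new("L", (28, 28)).save(path)  # make a 1-channel test image

# convert("RGB") yields a 3-channel image regardless of the source mode
img = Image.open(path).convert("RGB")
print(img.mode)  # RGB
```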



]]></content:encoded>
					
					<wfw:commentRss>https://stackoverflow.max-everyday.com/2021/06/pytorch-output-with-shape-1-28-28-doesnt-match-the-broadcast-shape-3-28-28/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
