# Background Import Guide - Import 380K Questions

## ✅ **DONE!**

A background job system for importing ALL 380K questions from EMS.

---

## 🎯 **WHAT WAS CREATED:**

### **1. Database:**

✅ `ems_questions` table - stores the questions  
✅ `ems_import_progress` table - tracks job progress  

### **2. Models:**

✅ `EmsQuestion` - Question model  
✅ `EmsImportProgress` - Progress tracking model  

### **3. Job:**

✅ `ImportEmsQuestionsJob` - Background job  

### **4. APIs:**

| API | Endpoint | Purpose |
|-----|----------|---------|
| Start Job | `POST /api/speakup/import-background` | Start a background import |
| Check Progress | `GET /api/speakup/import-progress/{jobId}` | View progress |
| Cancel Job | `POST /api/speakup/import-cancel/{jobId}` | Cancel a running job |
| List Jobs | `GET /api/speakup/import-jobs` | List all jobs |

---

## 🚀 **HOW TO USE**

### **STEP 1: Start Queue Worker**

```bash
# Terminal 1: start the queue worker (REQUIRED!)
cd /var/www/html/lms_hocmai
php artisan queue:work --timeout=90000

# Or use Supervisor (production, RECOMMENDED)
# See the "Setup Queue Worker" section below
```

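⚠️ **NOTE:** The queue worker MUST be running, otherwise the job will never execute! `--timeout` is given in seconds, so `90000` lets a single job run for 25 hours before the worker kills it.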

---

### **STEP 2: Start Import Job**

```bash
# Import ALL 380K questions
curl -X POST 'https://lmsnew.hocmai.net/api/speakup/import-background' \
  -H 'Content-Type: application/json' \
  -d '{
    "start_id": 1,
    "end_id": 380361,
    "batch_size": 100
  }'
```

**Response:**
```json
{
    "status": true,
    "message": "Background import job started",
    "data": {
        "job_id": "import_20251114032644_abc12345",
        "start_id": 1,
        "end_id": 380361,
        "total_questions": 380361,
        "batch_size": 100,
        "check_progress_url": "https://.../api/speakup/import-progress/import_20251114032644_abc12345"
    }
}
```

**Save the `job_id`** so you can track progress!
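
To avoid copying the ID by hand, the response can be parsed in the shell. The helper below is only a sketch: `extract_job_id` is a hypothetical name, not part of the project, and it assumes the response keeps the `"job_id": "..."` shape shown in the sample above. It uses `sed` to stay dependency-free; where `jq` is installed, `jq -r '.data.job_id'` is likely the more robust choice.

```bash
# extract_job_id (hypothetical helper): read a start-job response on stdin
# and print the value of the first "job_id" field it finds.
# Assumes the JSON shape shown in the sample response.
extract_job_id() {
    sed -n 's/.*"job_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Possible usage (not executed here):
#   JOB_ID=$(curl -s -X POST "$START_URL" \
#       -H 'Content-Type: application/json' \
#       -d '{"start_id": 1, "end_id": 380361}' | extract_job_id)
```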

---

### **STEP 3: Track Progress (Realtime)**

```bash
# Check progress
curl 'https://lmsnew.hocmai.net/api/speakup/import-progress/import_20251114032644_abc12345'
```

**Response:**
```json
{
    "status": true,
    "data": {
        "job_id": "import_20251114032644_abc12345",
        "status": "running",
        "progress": {
            "total": 380361,
            "processed": 50000,
            "remaining": 330361,
            "percentage": 13.15,
            "current_id": 50000
        },
        "stats": {
            "imported": 48500,
            "skipped": 0,
            "not_found": 1200,
            "errors": 300,
            "success_rate": "97.00%"
        },
        "time": {
            "started_at": "2025-11-14T10:00:00.000000Z",
            "completed_at": null,
            "elapsed_seconds": 3600,
            "eta_seconds": 23800,
            "eta_formatted": "06:36:40"
        },
        "error_message": null
    }
}
```

**Explanation:**
- Progress: 13.15% complete
- Imported: 48,500 questions imported successfully
- Not found: 1,200 questions do not exist (ID gaps)
- Errors: 300 questions hit real errors
- ETA: about 6.6 hours remaining
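
The derived fields in the sample response can be reproduced with a little shell arithmetic. This is a sketch under assumptions: the function names are hypothetical, and the linear ETA formula is a guess at how the job estimates remaining time. For the sample figures it gives 23785 seconds rather than the 23800 shown, so the real job presumably rounds or computes it somewhat differently.

```bash
# Hypothetical helpers that mirror the derived fields of the progress response.

# percent PART TOTAL -> percentage with two decimals
percent() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f", part / total * 100 }'
}

# format_hms SECONDS -> HH:MM:SS (hours are not wrapped at 24)
format_hms() {
    local s=$1
    printf '%02d:%02d:%02d' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}

# eta_seconds ELAPSED PROCESSED REMAINING -> naive linear estimate.
# Assumption: the average time per processed question so far stays constant.
eta_seconds() {
    [ "$2" -gt 0 ] || return 1   # no estimate before the first question
    echo $(( $1 * $3 / $2 ))
}
```

With the sample values, `percent 50000 380361` prints `13.15`, `percent 48500 50000` prints `97.00`, and `format_hms 23800` prints `06:36:40`.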

---

### **STEP 4: Monitor in Real Time (Optional)**

```bash
# Auto-refresh every 10 seconds
watch -n 10 'curl -s "https://lmsnew.hocmai.net/api/speakup/import-progress/JOB_ID"'
```

Or create a script:

```bash
#!/bin/bash
# monitor_import.sh

JOB_ID="import_20251114032644_abc12345"
URL="https://lmsnew.hocmai.net/api/speakup/import-progress/$JOB_ID"

while true; do
    clear
    echo "=== Import Progress ==="
    # Fetch once per iteration so the display and the status check agree
    RESPONSE=$(curl -s "$URL")
    echo "$RESPONSE" | jq .

    # Stop polling once the job reaches a final state
    STATUS=$(echo "$RESPONSE" | jq -r '.data.status')

    if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
        echo "Job finished with status: $STATUS"
        break
    fi

    sleep 10
done
```

---

### **STEP 5: Cancel If Needed**

```bash
curl -X POST 'https://lmsnew.hocmai.net/api/speakup/import-cancel/import_20251114032644_abc12345'
```

**Response:**
```json
{
    "status": true,
    "message": "Job cancelled successfully",
    "data": {
        "job_id": "import_20251114032644_abc12345",
        "imported": 48500,
        "progress": "13.15%"
    }
}
```

The job stops after it finishes processing the current question.

---

## ⚙️ **SETUP QUEUE WORKER**

### **Development (Manual):**

```bash
# Run continuously in a dedicated terminal
php artisan queue:work --timeout=90000
```

---

### **Production (Supervisor - RECOMMENDED):**

#### **1. Install Supervisor:**

```bash
sudo apt-get install supervisor
```

#### **2. Create config:**

```bash
sudo nano /etc/supervisor/conf.d/lms-queue-worker.conf
```

**Contents:**

```ini
[program:lms-queue-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/html/lms_hocmai/artisan queue:work --sleep=3 --tries=1 --timeout=90000
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
user=www-data
numprocs=2
redirect_stderr=true
stdout_logfile=/var/www/html/lms_hocmai/storage/logs/queue-worker.log
stopwaitsecs=3600
```

#### **3. Start supervisor:**

```bash
sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl start "lms-queue-worker:*"
```

#### **4. Check status:**

```bash
sudo supervisorctl status "lms-queue-worker:*"
```

---

## 📊 **TIME ESTIMATE**

### **Estimate for 380K questions:**

```
Total questions: 380,361

Average time: ~2 seconds/question (the itemized components below sum to ~1.1s)
- EMS API call: 0.5s
- Save to DB: 0.1s
- Sleep (rate limiting): 0.01s per question
- Network overhead: 0.5s

Total time: 380,361 × 2s = 760,722s ≈ 211 hours ≈ 8.8 days

With 2 workers in parallel:
→ ~4.4 days

In practice (not_found IDs are skipped faster):
→ ~3-4 days

Possible optimizations:
- More workers: 4 workers → ~2 days
- Shorter sleep time: → ~1.5 days
```
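
The arithmetic above is easy to re-run with other assumptions. The sketch below uses a hypothetical `estimate_days` function and assumes perfect parallel scaling, meaning the ID range is split evenly across jobs and workers never wait on each other. Real runs will rarely scale that cleanly.

```bash
# estimate_days TOTAL SECONDS_PER_QUESTION WORKERS -> days, one decimal place.
# Assumes ideal parallel scaling across workers (an optimistic simplification).
estimate_days() {
    awk -v n="$1" -v s="$2" -v w="$3" 'BEGIN { printf "%.1f", n * s / w / 86400 }'
}

# Examples matching the figures in this section:
#   estimate_days 380361 2 1   # 1 worker
#   estimate_days 380361 2 2   # 2 workers
#   estimate_days 380361 2 4   # 4 workers
```

For 380,361 questions at 2 seconds each this yields 8.8, 4.4, and 2.2 days for 1, 2, and 4 workers.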

---

## 🎯 **BEST PRACTICES**

### **1. Split into several smaller jobs:**

```bash
API='https://lmsnew.hocmai.net/api/speakup/import-background'
JSON='Content-Type: application/json'

# Instead of 1 large job (380K):
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 1, "end_id": 380361}'

# Split into several smaller jobs (RECOMMENDED):
# Job 1: IDs 1-100,000
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 1, "end_id": 100000}'

# Job 2: IDs 100,001-200,000
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 100001, "end_id": 200000}'

# Job 3: IDs 200,001-300,000
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 200001, "end_id": 300000}'

# Job 4: IDs 300,001-380,361
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 300001, "end_id": 380361}'
```

**Advantages:**
- ✅ Easier to monitor
- ✅ Easier to retry on failure
- ✅ Can run in parallel
- ✅ Each job can be cancelled individually
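
Rather than typing each range by hand, the boundaries can be generated. The following is a minimal sketch with a hypothetical `split_ranges` function; it only prints `start_id end_id` pairs and leaves the actual requests to the caller.

```bash
# split_ranges START END CHUNK -> one "start_id end_id" pair per line.
# The last range is shortened so that it ends exactly at END.
split_ranges() {
    local start=$1 end=$2 chunk=$3 stop
    [ "$chunk" -gt 0 ] || return 1   # guard against an endless loop
    while [ "$start" -le "$end" ]; do
        stop=$((start + chunk - 1))
        if [ "$stop" -gt "$end" ]; then
            stop=$end
        fi
        echo "$start $stop"
        start=$((stop + 1))
    done
}

# Possible usage (not executed here; $API as in the example above):
#   split_ranges 1 380361 100000 | while read -r s e; do
#       curl -X POST "$API" -H 'Content-Type: application/json' \
#           -d "{\"start_id\": $s, \"end_id\": $e}"
#   done
```

Calling `split_ranges 1 380361 100000` reproduces the four ranges listed above.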

---

### **2. Monitor progress:**

```bash
# Check every minute
watch -n 60 'curl -s "https://lmsnew.hocmai.net/api/speakup/import-progress/JOB_ID" | jq ".data.progress"'
```

---

### **3. Check logs:**

```bash
# Queue worker logs
tail -f storage/logs/queue-worker.log

# Laravel logs
tail -f storage/logs/laravel.log | grep "Import job"
```

---

## 📊 **EXAMPLE PROGRESS TRACKING**

### **Job just started (0.1%):**

```json
{
    "status": "running",
    "progress": {
        "percentage": 0.1,
        "processed": 380,
        "remaining": 379981
    },
    "stats": {
        "imported": 365,
        "not_found": 12,
        "errors": 3
    },
    "time": {
        "elapsed_seconds": 760,
        "eta_seconds": 759240,
        "eta_formatted": "210:54:00"
    }
}
```

---

### **Job in progress (50%):**

```json
{
    "status": "running",
    "progress": {
        "percentage": 50.00,
        "processed": 190180,
        "remaining": 190181
    },
    "stats": {
        "imported": 184500,
        "not_found": 800,
        "errors": 150
    },
    "time": {
        "elapsed_seconds": 380360,
        "eta_seconds": 380360,
        "eta_formatted": "105:39:20"
    }
}
```

---

### **Job completed (100%):**

```json
{
    "status": "completed",
    "progress": {
        "percentage": 100.00,
        "processed": 380361,
        "remaining": 0
    },
    "stats": {
        "imported": 368000,
        "skipped": 0,
        "not_found": 1611,
        "errors": 750,
        "success_rate": "96.75%"
    },
    "time": {
        "started_at": "2025-11-14T10:00:00Z",
        "completed_at": "2025-11-17T18:30:00Z",
        "elapsed_seconds": 289800,
        "eta_seconds": 0
    }
}
```

**Explanation:**
- 368,000 questions imported successfully (96.75%)
- 1,611 questions do not exist (ID gaps)
- 750 questions failed (network issues, can be retried)
- Total time: ~80.5 hours (about 3.4 days)

---

## 🔧 **TROUBLESHOOTING**

### **Error 1: "Queue worker not running"**

**Symptom:** The job never runs; its status stays `pending`

**Solution:**
```bash
# Check for a queue worker (the [q] keeps grep from matching itself)
ps aux | grep "[q]ueue:work"

# If none is running, start one:
php artisan queue:work --timeout=90000 &
```

---

### **Error 2: "There is already a running import job"**

**Symptom:** A new job cannot be started

**Solution:**
```bash
# Check which job is running
curl 'https://lmsnew.hocmai.net/api/speakup/import-jobs'

# To cancel it:
curl -X POST 'https://lmsnew.hocmai.net/api/speakup/import-cancel/JOB_ID'
```

---

### **Error 3: Job failed**

**Symptom:** Status = `failed`

**Solution:**
```bash
# Check logs
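tail -n 100 storage/logs/laravel.log | grep "Import job failed"

# Retry from current_id (replace CURRENT_ID with the value from the progress response)
curl -X POST 'https://lmsnew.hocmai.net/api/speakup/import-background' \
  -H 'Content-Type: application/json' \
  -d '{"start_id": CURRENT_ID, "end_id": 380361}'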
```
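
Because `CURRENT_ID` is substituted by hand, a small check before sending the retry can catch typos. The sketch below uses a hypothetical `retry_payload` function that validates both IDs and prints the request body. Like the command above, it restarts at `current_id` itself; whether that question was already processed when the job failed is not specified here.

```bash
# retry_payload CURRENT_ID END_ID -> JSON body for the start-job request.
# Fails with a message on stderr if an ID is not a non-negative integer
# or if the range is empty.
retry_payload() {
    local v
    for v in "$1" "$2"; do
        case "$v" in
            ''|*[!0-9]*)
                echo "IDs must be non-negative integers" >&2
                return 1
                ;;
        esac
    done
    if [ "$1" -gt "$2" ]; then
        echo "start_id must not exceed end_id" >&2
        return 1
    fi
    printf '{"start_id": %s, "end_id": %s}' "$1" "$2"
}

# Possible usage (not executed here):
#   curl -X POST 'https://lmsnew.hocmai.net/api/speakup/import-background' \
#       -H 'Content-Type: application/json' -d "$(retry_payload 50000 380361)"
```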

---

## 📋 **WORKFLOW FOR IMPORTING 380K**

### **Option A: 1 large job (Simple)** ⭐⭐⭐

```bash
# 1. Start queue worker
php artisan queue:work --timeout=90000 &

# 2. Start import
curl -X POST 'https://lmsnew.hocmai.net/api/speakup/import-background' \
  -H 'Content-Type: application/json' \
  -d '{
    "start_id": 1,
    "end_id": 380361,
    "batch_size": 100
  }'

# 3. Save the job_id from the response

# 4. Monitor
watch -n 60 'curl -s "https://lmsnew.hocmai.net/api/speakup/import-progress/JOB_ID"'

# 5. Wait 3-4 days for it to complete
```

**Advantages:**
- Simplest approach
- A single request

**Disadvantages:**
- A failure midway wastes a lot of time
- Hard to retry

---

### **Option B: 4 parallel jobs (Faster)** ⭐⭐⭐⭐⭐ **RECOMMENDED**

```bash
# 1. Start 4 queue workers (in 4 terminals, or in the background as here)
php artisan queue:work --timeout=90000 &
php artisan queue:work --timeout=90000 &
php artisan queue:work --timeout=90000 &
php artisan queue:work --timeout=90000 &

# 2. Start 4 jobs in parallel:
API='https://lmsnew.hocmai.net/api/speakup/import-background'
JSON='Content-Type: application/json'

# Job 1: 1-95,000
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 1, "end_id": 95000}'

# Job 2: 95,001-190,000
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 95001, "end_id": 190000}'

# Job 3: 190,001-285,000
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 190001, "end_id": 285000}'

# Job 4: 285,001-380,361
curl -X POST "$API" -H "$JSON" \
  -d '{"start_id": 285001, "end_id": 380361}'

# 3. Check all jobs
curl 'https://lmsnew.hocmai.net/api/speakup/import-jobs'
```

**Advantages:**
- ✅ About 4x faster (~1 day instead of 4 days)
- ✅ If 1 job fails, the other 3 keep running
- ✅ Easy to retry individual ranges

**Disadvantages:**
- Requires 4 queue workers
- Uses more resources

---

## 📊 **SUPERVISOR CONFIG (Production)**

### **File: `/etc/supervisor/conf.d/lms-queue-worker.conf`**

```ini
[program:lms-queue-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/html/lms_hocmai/artisan queue:work --sleep=3 --tries=1 --timeout=90000
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
user=www-data
numprocs=4
redirect_stderr=true
stdout_logfile=/var/www/html/lms_hocmai/storage/logs/queue-worker.log
stopwaitsecs=3600
```

**Start supervisor:**
```bash
sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl start "lms-queue-worker:*"

# Check status
sudo supervisorctl status
```

---

## 🎉 **SUMMARY**

### **TO IMPORT ALL 380K QUESTIONS:**

```bash
# 1. Start queue worker (REQUIRED!)
php artisan queue:work --timeout=90000 &

# 2. Start import
curl -X POST 'https://lmsnew.hocmai.net/api/speakup/import-background' \
  -H 'Content-Type: application/json' \
  -d '{
    "start_id": 1,
    "end_id": 380361
  }'

# 3. Get the job_id from the response

# 4. Track progress
curl 'https://lmsnew.hocmai.net/api/speakup/import-progress/JOB_ID'

# 5. Wait ~3-4 days

# 6. Check final stats (this endpoint is not listed in the API table above)
curl 'https://lmsnew.hocmai.net/api/speakup/import-stats'
```

---

## ✅ **ADVANTAGES**

```
✅ No timeouts (runs in the background)
✅ Real-time progress tracking
✅ Can be cancelled at any time
✅ Auto retry (if configured)
✅ Scales with multiple workers
✅ Professional solution
✅ Production ready
```

---

## 🚀 **READY TO USE!**

**Migrations already run:**
- ✅ `ems_questions` table created
- ✅ `ems_import_progress` table created

**Next steps:**
1. Start queue worker
2. Start import job
3. Monitor progress
4. Wait for completion

**Estimated time:** 3-4 days (1 worker) or 1 day (4 workers)

🎉 **START IMPORTING NOW!**

