Tried to parse different URL types, but it failed

Julian Freeman
2025-12-02 10:29:39 -04:00
parent ee4880a83b
commit 068bc1c79e
5 changed files with 219 additions and 24 deletions

spec/url_parsing.md Normal file

@@ -0,0 +1,111 @@
# Feature Request: Advanced Playlist Parsing & Mix Handling
## Context
We are upgrading the **StreamCapture** application. Currently, the app only supports parsing standard single video URLs.
We need to implement robust support for **Standard Playlists** and **YouTube Mixes (Radio)**, while solving performance issues and thumbnail loading errors.
## Problem Statement
1. **Playlists (`/playlist?list=...`)**: Fails to parse entries or hangs because it attempts to resolve full metadata for every video. Thumbnails are missing.
2. **Mixes (`/watch?v=...&list=RD...`)**: Currently fails or behaves unpredictably.
3. **UI/UX**: Users need a specific choice when encountering a "Mix" link:
* **Option A (Default):** Treat it as a single video (ignore the list).
* **Option B:** Parse the Mix as a playlist (limit to top 20 items).
## Technical Requirements
### 1. Rust Backend Refactoring (`src-tauri/src/ytdlp.rs`)
Refactor the `fetch_metadata` command to use a unified, efficient parsing strategy.
#### A. Command Construction
For **ALL** metadata fetching, use the `--flat-playlist` flag to prevent deep extraction (which causes the hang).
**Base Command:**
```bash
yt-dlp --dump-single-json --flat-playlist --no-warnings [URL]
```
#### B. Handling Different URL Types
1. Single Video:
- `yt-dlp` returns a single JSON object with `_type: "video"` (or no type).
- Action: Wrap it in a list of 1.
2. Standard Playlist:
- `yt-dlp` returns `_type: "playlist"` with an `entries` array.
- Action: Map the `entries` to our `Video` struct.
3. Mix / Radio (Infinite List):
- Condition: If the frontend flags this request as a "Mix Playlist scan".
- Modification: Add the flag `--playlist-end 20` to the command.
- Reason: Mixes are infinite; we must cap them.
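The command construction described in sections A and B can be sketched as a small argument builder. This is a minimal sketch rather than the project's code: `build_metadata_args` and `MIX_SCAN_LIMIT` are hypothetical names, and it uses only the yt-dlp flags already listed in this spec.

```rust
/// Cap for Mix scans (the "top 20 items" from the Problem Statement).
/// Hypothetical constant name.
const MIX_SCAN_LIMIT: u32 = 20;

/// Builds the argument list passed to yt-dlp for a metadata scan.
/// `parse_mix_playlist` mirrors the flag proposed in this spec;
/// the function name itself is an assumption.
pub fn build_metadata_args(url: &str, parse_mix_playlist: bool) -> Vec<String> {
    let mut args = vec![
        "--dump-single-json".to_string(),
        "--flat-playlist".to_string(),
        "--no-warnings".to_string(),
    ];
    if parse_mix_playlist {
        // Mixes have no natural end, so the scan is capped.
        args.push("--playlist-end".to_string());
        args.push(MIX_SCAN_LIMIT.to_string());
    }
    // The URL goes last, after every option.
    args.push(url.to_string());
    args
}
```

Keeping the builder free of process-spawning code makes the flag logic easy to unit test; the resulting vector could then be handed to `Command::args`.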
#### C. Data Normalization & Thumbnail Fallback
When using `--flat-playlist`, the `entries` often lack full `thumbnail` URLs or return webp formats that might not render immediately.
- Logic: If the `thumbnail` field is missing or empty in an entry, construct it manually using the ID:
- `https://i.ytimg.com/vi/{video_id}/mqdefault.jpg`
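The fallback rule above might look like the following helper. It is a sketch under stated assumptions: `resolve_thumbnail` is a hypothetical name, and returning `None` when the video ID is also empty is a choice made here, not something this spec requires.

```rust
/// Returns the thumbnail reported by yt-dlp, or builds the fallback URL
/// from the video ID when the field is missing or empty.
/// Assumption: with no thumbnail and no ID there is nothing useful to
/// build, so the sketch returns `None` instead of a broken URL.
pub fn resolve_thumbnail(thumbnail: Option<&str>, video_id: &str) -> Option<String> {
    match thumbnail.map(str::trim) {
        // A non-empty value from yt-dlp is used as-is.
        Some(t) if !t.is_empty() => Some(t.to_string()),
        // No thumbnail and no ID: nothing to construct.
        _ if video_id.is_empty() => None,
        // Missing or empty thumbnail: fall back to the ID-based URL.
        _ => Some(format!("https://i.ytimg.com/vi/{}/mqdefault.jpg", video_id)),
    }
}
```

The caller can map `None` to a placeholder image in the UI, which avoids requesting a URL with an empty ID segment.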
### 2. Frontend Logic (`src/views/Home.vue` & Stores)
#### A. URL Detection & User Choice
Before sending the URL to Rust, analyze the string:
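1. Regex Check: Detect if the URL contains both `v=` AND `list=` (typical for Mixes, whose `list` value starts with `RD`). Note that a video opened from a standard playlist also matches this pattern.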
2. New UI Element:
- If a Mix link is detected, show a **Checkbox** or **Toggle** near the "Analyze" button.
- Label: "Scan Playlist (Max 20)"
- Default State: Unchecked (Off).
3. Submission Logic:
- If Unchecked (Default): Strip the `&list=...` parameter from the URL string before calling Rust. Treat it as a pure single video.
- If Checked: Keep the full URL and pass a flag (e.g., `is_mix: true`) to the Rust backend (or handle the logic to request the top 20).
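The detection and stripping rules above can be sketched without any URL library. The example is written in Rust only to keep this spec's examples in one language; the real check would live in `Home.vue`. The names `query_value`, `looks_like_mix`, and `strip_list_params` are hypothetical, and removing `index` and `start_radio` along with `list` is an assumption about which parameters accompany a Mix link.

```rust
/// Returns the value of `key` in the URL's query string, if present.
fn query_value<'a>(url: &'a str, key: &str) -> Option<&'a str> {
    // Ignore any fragment, then take everything after the first '?'.
    let head = url.split('#').next().unwrap_or(url);
    let (_, query) = head.split_once('?')?;
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(k, _)| *k == key)
        .map(|(_, v)| v)
}

/// True when the URL carries both a video ID and a list ID.
pub fn looks_like_mix(url: &str) -> bool {
    let has_video = query_value(url, "v").map_or(false, |v| !v.is_empty());
    let has_list = query_value(url, "list").map_or(false, |v| !v.is_empty());
    has_video && has_list
}

/// Removes the list-related parameters so the URL points at one video.
pub fn strip_list_params(url: &str) -> String {
    const DROPPED: [&str; 3] = ["list", "index", "start_radio"];
    let (head, fragment) = match url.split_once('#') {
        Some((head, frag)) => (head, Some(frag)),
        None => (url, None),
    };
    let (base, query) = match head.split_once('?') {
        Some(parts) => parts,
        None => return url.to_string(),
    };
    // Keep every pair whose key is not in the dropped set.
    let kept: Vec<&str> = query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .filter(|pair| {
            let key = pair.split_once('=').map_or(*pair, |(k, _)| k);
            !DROPPED.contains(&key)
        })
        .collect();
    let mut out = base.to_string();
    if !kept.is_empty() {
        out.push('?');
        out.push_str(&kept.join("&"));
    }
    if let Some(frag) = fragment {
        out.push('#');
        out.push_str(frag);
    }
    out
}
```

Filtering parsed key/value pairs also handles a `list` parameter that appears first in the query string. A stricter `looks_like_mix` could additionally require the `list` value to start with `RD`, the prefix shown in the Problem Statement.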
#### B. Displaying Results
- Ensure the "Selection Area" can render a list of cards whether it contains 1 video or 20 videos.
- If it's a playlist, show a "Select All" / "Deselect All" control.
## Implementation Plan
### Step 1: Rust Structs Update
Update the Serde structs in `ytdlp.rs` to handle the `flat-playlist` JSON structure.
```rust
// Example structure hint
struct YtDlpResponse {
_type: Option<String>,
entries: Option<Vec<VideoEntry>>,
// ... fields for single video fallback
id: Option<String>,
title: Option<String>,
}
struct VideoEntry {
id: String,
title: String,
duration: Option<f64>,
thumbnail: Option<String>, // Might be missing
}
```
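Building on the structure hint above, the "wrap a single video in a list of 1" and "map `entries`" rules from the Technical Requirements could be normalized in one function. This is a sketch with plain structs and no Serde derive so that it stands alone; `into_video_list` and the `kind` field name are hypothetical, and in real code `kind` would presumably be mapped from `_type` with `#[serde(rename = "_type")]`.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct VideoEntry {
    pub id: String,
    pub title: String,
    pub duration: Option<f64>,
    pub thumbnail: Option<String>, // Might be missing
}

#[derive(Debug, Clone, Default)]
pub struct YtDlpResponse {
    pub kind: Option<String>, // stands in for the JSON `_type` field
    pub entries: Option<Vec<VideoEntry>>,
    pub id: Option<String>,
    pub title: Option<String>,
}

/// Normalizes either response shape into a flat list of videos.
pub fn into_video_list(resp: YtDlpResponse) -> Vec<VideoEntry> {
    // Playlists (and Mixes scanned as playlists) already carry entries.
    if resp.kind.as_deref() == Some("playlist") {
        return resp.entries.unwrap_or_default();
    }
    // Anything else is treated as a single video, wrapped in a list of one.
    match resp.id {
        Some(id) => vec![VideoEntry {
            id,
            title: resp.title.unwrap_or_else(|| "Unknown Title".to_string()),
            duration: None,
            thumbnail: None,
        }],
        // Without an ID there is no video to show.
        None => Vec::new(),
    }
}
```

Returning one `Vec<VideoEntry>` for both cases lets the frontend render the same card list whether it holds 1 video or 20.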
### Step 2: Rust Logic Update
Modify `commands::fetch_metadata`.
- Accept an optional argument `parse_mix_playlist: bool`.
- If `true`, append `--playlist-end 20`.
- Implement the thumbnail fallback logic (if `thumbnail` is None, use `i.ytimg.com`).
### Step 3: Frontend Update
- Add the "Mix Detected" logic in `Home.vue`.
- Add the toggle UI.
- Update the `analyze` function to handle URL stripping vs. passing through based on the toggle.
## Deliverables
Please rewrite the necessary parts of `src-tauri/src/ytdlp.rs`, `src-tauri/src/commands.rs`, and `src/views/Home.vue` to implement this logic.


@@ -30,8 +30,8 @@ pub fn get_ytdlp_version(app: AppHandle) -> Result<String, String> {
 }
 #[tauri::command]
-pub async fn fetch_metadata(app: AppHandle, url: String) -> Result<downloader::MetadataResult, String> {
-    downloader::fetch_metadata(&app, &url).await.map_err(|e| e.to_string())
+pub async fn fetch_metadata(app: AppHandle, url: String, parse_mix_playlist: bool) -> Result<downloader::MetadataResult, String> {
+    downloader::fetch_metadata(&app, &url, parse_mix_playlist).await.map_err(|e| e.to_string())
 }
 #[tauri::command]


@@ -46,18 +46,22 @@ pub struct ProgressEvent {
     pub status: String, // "downloading", "processing", "finished", "error"
 }
-pub async fn fetch_metadata(app: &AppHandle, url: &str) -> Result<MetadataResult> {
+pub async fn fetch_metadata(app: &AppHandle, url: &str, parse_mix_playlist: bool) -> Result<MetadataResult> {
     let ytdlp_path = ytdlp::get_ytdlp_path(app)?;
     // Use std::process for simple output capture if it's short, but tokio is safer for async.
-    let output = Command::new(ytdlp_path)
-        .arg("--dump-single-json")
+    let mut cmd = Command::new(ytdlp_path);
+    cmd.arg("--dump-single-json")
         .arg("--flat-playlist")
-        .arg(url)
-        // Stop errors from cluttering
-        .stderr(Stdio::piped())
-        .output()
-        .await?;
+        .arg("--no-warnings");
+    if parse_mix_playlist {
+        cmd.arg("--playlist-end").arg("20");
+    }
+    cmd.arg(url);
+    cmd.stderr(Stdio::piped());
+    let output = cmd.output().await?;
     if !output.status.success() {
         let stderr = String::from_utf8_lossy(&output.stderr);
@@ -90,10 +94,18 @@ pub async fn fetch_metadata(app: &AppHandle, url: &str) -> Result<MetadataResult
 }
 fn parse_video_metadata(json: &serde_json::Value) -> VideoMetadata {
+    let id = json["id"].as_str().unwrap_or("").to_string();
+    // Thumbnail fallback logic
+    let thumbnail = match json.get("thumbnail").and_then(|t| t.as_str()) {
+        Some(t) if !t.is_empty() => t.to_string(),
+        _ => format!("https://i.ytimg.com/vi/{}/mqdefault.jpg", id),
+    };
     VideoMetadata {
-        id: json["id"].as_str().unwrap_or("").to_string(),
+        id,
         title: json["title"].as_str().unwrap_or("Unknown Title").to_string(),
-        thumbnail: json["thumbnail"].as_str().unwrap_or("").to_string(), // Note: thumbnails might be an array sometimes, usually string in flat-playlist
+        thumbnail,
         duration: json["duration"].as_f64(),
         uploader: json["uploader"].as_str().map(|s| s.to_string()),
     }


@@ -8,6 +8,10 @@ export const useAnalysisStore = defineStore('analysis', () => {
   const error = ref('')
   const metadata = ref<any>(null)
+  // New state for mix detection
+  const isMix = ref(false)
+  const scanMix = ref(false)
   const options = ref({
     is_audio_only: false,
     quality: 'best',
@@ -19,10 +23,9 @@ export const useAnalysisStore = defineStore('analysis', () => {
     loading.value = false
     error.value = ''
     metadata.value = null
-    // We keep options as is, or reset them?
-    // Usually keeping user preference for "Audio Only" is nice,
-    // but let's just reset the content-related stuff.
+    isMix.value = false
+    scanMix.value = false
   }
-  return { url, loading, error, metadata, options, reset }
+  return { url, loading, error, metadata, options, isMix, scanMix, reset }
 })


@@ -26,6 +26,17 @@ watch(() => settingsStore.settings.download_path, (newPath) => {
   }
 }, { immediate: true })
+// Detect Mix URL
+watch(() => analysisStore.url, (newUrl) => {
+  if (newUrl && newUrl.includes('v=') && newUrl.includes('list=')) {
+    analysisStore.isMix = true
+  } else {
+    analysisStore.isMix = false
+    // Reset scanMix if URL changes to non-mix
+    analysisStore.scanMix = false
+  }
+})
 async function analyze() {
   if (!analysisStore.url) return
   analysisStore.loading = true
@@ -33,7 +44,29 @@ async function analyze() {
   analysisStore.metadata = null
   try {
-    const res = await invoke('fetch_metadata', { url: analysisStore.url })
+    let urlToScan = analysisStore.url;
+    let parseMix = false;
+    if (analysisStore.isMix) {
+      if (analysisStore.scanMix) {
+        // Keep URL as is, tell backend to limit scan
+        parseMix = true;
+      } else {
+        // Strip list param
+        try {
+          const u = new URL(urlToScan);
+          u.searchParams.delete('list');
+          u.searchParams.delete('index');
+          u.searchParams.delete('start_radio');
+          urlToScan = u.toString();
+        } catch (e) {
+          // Fallback regex if URL parsing fails
+          urlToScan = urlToScan.replace(/&list=[^&]+/, '');
+        }
+      }
+    }
+    const res = await invoke('fetch_metadata', { url: urlToScan, parseMixPlaylist: parseMix })
     analysisStore.metadata = res
   } catch (e: any) {
     analysisStore.error = e.toString()
@@ -55,8 +88,28 @@ async function startDownload() {
       { title: analysisStore.metadata.title, thumbnail: "", id: analysisStore.metadata.id } :
       analysisStore.metadata;
+    // Note: We might want to pass the *cleaned* URL if it was cleaned during analyze
+    // But for now we pass the original URL or whatever was scanned.
+    // Actually, if we scanned as a single video (unchecked), we should probably download as single video.
+    // The user might expect the same result as analysis.
+    // Let's reconstruct the URL logic or just use what `analyze` used?
+    // Since `start_download` just takes a URL string, we should probably use the same logic.
+    let urlToDownload = analysisStore.url;
+    if (analysisStore.isMix && !analysisStore.scanMix) {
+      try {
+        const u = new URL(urlToDownload);
+        u.searchParams.delete('list');
+        u.searchParams.delete('index');
+        u.searchParams.delete('start_radio');
+        urlToDownload = u.toString();
+      } catch (e) {
+        urlToDownload = urlToDownload.replace(/&list=[^&]+/, '');
+      }
+    }
     const id = await invoke<string>('start_download', {
-      url: analysisStore.url,
+      url: urlToDownload,
       options: analysisStore.options,
       metadata: metaToSend
     })
@@ -104,6 +157,22 @@ async function startDownload() {
         <span v-else>Analyze</span>
       </button>
     </div>
+    <!-- Mix Toggle -->
+    <div v-if="analysisStore.isMix" class="mt-4 flex items-center gap-3">
+      <button
+        @click="analysisStore.scanMix = !analysisStore.scanMix"
+        class="w-12 h-6 rounded-full relative transition-colors duration-200 ease-in-out flex-shrink-0"
+        :class="analysisStore.scanMix ? 'bg-blue-600' : 'bg-gray-300 dark:bg-zinc-600'"
+      >
+        <span
+          class="absolute top-1 left-1 w-4 h-4 bg-white rounded-full transition-transform duration-200"
+          :class="analysisStore.scanMix ? 'translate-x-6' : 'translate-x-0'"
+        />
+      </button>
+      <span class="text-sm font-medium text-zinc-700 dark:text-gray-300">Scan Playlist (Max 20)</span>
+    </div>
     <p v-if="analysisStore.error" class="mt-3 text-red-500 text-sm">{{ analysisStore.error }}</p>
   </div>