Add interactive REPL mode + fix screenshot crawl #19
Open
sahilsunny wants to merge 15 commits into
Bare `scrapingbee` (no subcommand) now drops into a themed REPL with tab completion, history, and inline command help.
Adds:
- src/scrapingbee_cli/interactive.py: REPL loop, splash, completer
- src/scrapingbee_cli/theme.py: ScrapingBee brand theme + spinner
- src/scrapingbee_cli/help_formatter.py: Rich-styled click help
- pyproject.toml: prompt_toolkit>=3.0, rich>=13.0
Hooks `cli.py` so the click group is invoke_without_command=True and falls into run_repl() when no subcommand is given. Schedule hint is suppressed inside the REPL to avoid per-command noise.
Phase 2 (theme integration in command files for inline spinners during command runs) will follow as a separate commit.
Inside the REPL, commands now show a MiniBeeSpinner during the
API call, batches show a live honeycomb credit meter via
LiveCreditTracker, and verbose / completion output is rendered
with rich-styled helpers from theme.py.
Changes:
- batch.py: new _batch_done helper; honeycomb-trail progress;
styled batch-start banner; LiveCreditTracker wrap around the
batch run; usage_info kwarg on run_api_batch
- cli_utils.py: REPL-mode branches in _validate_range,
check_api_response, scrape_with_escalation, and write_output
verbose section
- client.py: parse_usage now exposes max_api_credit (needed by
LiveCreditTracker)
- commands/{amazon,chatgpt,fast_search,google,walmart,youtube}.py:
MiniBeeSpinner around single API calls; usage_info pass-through
to run_api_batch
- commands/scrape.py: spinner around single scrape; LiveCreditTracker
around batch; REPL-styled error on HTTP 4xx/5xx
- commands/usage.py: full styled dashboard (honeycomb meter,
credits used/remaining/total, concurrency, renewal date) when
invoked from the REPL; plain JSON kept for non-REPL
- commands/crawl.py: LiveCreditTracker wrap around run_urls_spider
so credit drain during long crawls is visible
Plain (non-REPL) output is unchanged for every code path. All 653
existing unit tests still pass.
Bug fix: remove the outer REPL spinner that wrapped every command. It blocked interactive commands (`tutorial`, `auth`) from prompting the user, masked their output, and double-stacked with the inner MiniBeeSpinner already added in Phase 2 for network commands.
Add `tutorial` and `unsafe` to the REPL command list and tab completion (introduced in v1.4.0/v1.4.1, were missing).
Prompt: drop the Powerline-arrow protrusion in favour of a single unified yellow tag, ` ScrapingBee ❯ `, with the chevron inside the tag. Renders identically in every terminal/font (Mac Terminal, Warp, iTerm2, etc.) since it uses only standard BMP glyphs. Set SCRAPINGBEE_POWERLINE=1 to opt back into the Powerline arrow if you have a patched font (Nerd Font / Powerline-patched).
Treat the REPL as a tool, not a mascot. The previous version
prioritised personality (splash, ASCII logo, bee emoticons,
rotating fun facts, cute exit) over getting out of the user's
way. This rewrite swaps that for psql/redis-cli/gh-style
density and consistency.
interactive.py — full rewrite:
- Remove the bee splash animation, ASCII-art logos, repeated
hint line on every prompt.
- One-line banner on startup, then prompt.
- Colon-prefixed REPL meta-commands (`:help`, `:q`, `:clear`,
`:set`, `:unset`, `:show`) so they don't collide with click
commands. Bare aliases (`help`, `exit`, `quit`, `q`, `clear`)
still work for muscle memory.
- Per-command tab completion driven by walking the click tree
at startup — `youtube-search --<TAB>` now shows YouTube flags,
`scrape --<TAB>` shows scrape flags. Bool/Choice flags auto-
detected from click param types (no more flat `_COMMON_FLAGS`
list that drifts from reality).
- Uniform output frame around every command: `─── cmd ─── ` divider
on top, `[ok]/[fail] 1.23s` line on the bottom.
- Bottom toolbar with live state: credits remaining (read from
the existing usage cache), last command name + status + duration,
active session settings.
- "Did you mean?" suggestions on unknown commands and on click
"no such option" errors (Levenshtein distance, threshold 2).
- Multi-line input via trailing backslash continuation.
- Session settings via `:set country-code=fr`, applied as default
flags to subsequent commands when not explicitly overridden.
- `:clear` uses standard `\033[2J\033[H` instead of the previous
scroll-and-jump heuristic.
- Silent exit (no "Buzz off!" message).
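The "Did you mean?" item above can be illustrated with a small sketch. It assumes a plain Levenshtein distance and the threshold of 2 named in the message; the function names `levenshtein` and `suggest` are hypothetical and not taken from interactive.py.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def suggest(word, candidates, threshold=2):
    """Return the closest candidate within `threshold` edits, or None."""
    best, best_dist = None, threshold + 1
    for candidate in candidates:
        dist = levenshtein(word, candidate)
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best


COMMANDS = ["scrape", "google", "usage"]  # illustrative subset only
print(suggest("scrapr", COMMANDS))
print(suggest("xyzzy", COMMANDS))
```

With a threshold of 2, a swapped pair of letters ("scarpe") still matches, because plain Levenshtein counts a transposition as two substitutions.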
theme.py:
- Replace MiniBeeSpinner's emoticon flap frames + rotating "Bee
facts" + time-of-day flavour messages with a single line:
ten braille-dot frames + the command name. Same API
(`with MiniBeeSpinner("scrape"):`) — call sites unchanged.
- Drop dead module-level state: MESSAGES, _BEE_FACTS,
_MSG_ROTATE_TICKS, _time_flavor.
All 653 unit tests still pass. SCRAPINGBEE_POWERLINE=1 still
opts into the protruding Powerline arrow for users with patched
fonts.
…eview
This is the working "non-TUI" iteration before switching to a true
full-screen TUI. Captures every fix in this round — keep it as a
checkpoint to fall back to if the TUI rewrite needs to be reverted.
interactive.py:
- Bordered input: dropped the Frame widget (rendering artifacts) and
the horizontal rules (yellow trails on resize) — input is now just
a chevron prompt + lexer-highlighted buffer + adaptive bottom toolbar.
- Tab completion: re-bind Tab/Shift-Tab/Esc on the custom KeyBindings
(the previous version overrode prompt_toolkit defaults).
- erase_when_done=True on the Application + manual `❯ <cmd>` echo into
scrollback after submit — fewer stale-render artifacts on resize.
- :set overhaul: validate keys against the click flag list, accept
"k=v ..." and "--k v ..." mixed forms, suggest on typo, validate
choice/bool values where known.
- :unset accepts space- or comma-separated keys; :unset *, :unset all,
:reset all clear every setting.
- :view meta-command: cross-platform pager built on prompt_toolkit
(no `less` dependency on Windows). Arrow keys / PgUp/PgDn / Home/End
/ mouse wheel to scroll, q / Esc to exit.
- Toolbar adapts to width: chips truncate to "+N more" when narrow.
- Per-command tab completion driven by walking the click tree (already
in the previous commit, retained).
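As a rough illustration of the `:set` overhaul described above, the sketch below parses the mixed "k=v ..." and "--k v ..." forms and validates keys against a known flag set. `parse_set_args` and the flag set are hypothetical; the real code derives the flag list from the click tree and also suggests on typos, which is left out here.

```python
import shlex


def parse_set_args(line, known_flags):
    """Parse 'k=v' and '--k v' tokens into a dict, rejecting unknown keys."""
    tokens = shlex.split(line)
    settings = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if "=" in token:
            # 'key=value' or '--key=value'
            key, value = token.lstrip("-").split("=", 1)
            i += 1
        elif token.startswith("--") and i + 1 < len(tokens):
            # '--key value'
            key, value = token[2:], tokens[i + 1]
            i += 2
        else:
            raise ValueError(f"expected key=value or --key value, got {token!r}")
        if key not in known_flags:
            raise ValueError(f"unknown setting {key!r}")
        settings[key] = value
    return settings


KNOWN = {"country-code", "render-js"}  # assumed subset of flags
print(parse_set_args("country-code=fr --render-js false", KNOWN))
```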
theme.py:
- Hex bloom spinner: 3-cell radial composition (centre + halo) so the
bloom radiates symmetrically instead of growing rightward. Frames
cycle dust → speck → outline → honeycomb → ✦ sparkle peak → drain,
paired with a dim→bright→warm colour gradient.
- White-glim shimmer sweeps across the verb ("Fetching", "Rendering")
in time with the bloom.
- Elapsed-time counter once an op runs > 0.5s.
- Per-command verb rotation (no bee facts).
cli_utils.py:
- Output preview in REPL mode: large text dumps (>30 lines OR >4 KB)
get truncated to a 30-line / 4 KB preview. Single-line minified HTML
is detected by byte threshold so it doesn't slip through.
- Full payload auto-saved to ~/.cache/scrapingbee-cli/last-output so
the user can :view / cat / less it.
- Binary output (PNG, PDF, etc.) is never truncated.
- Non-REPL invocations are unchanged so pipes/redirects keep working.
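A minimal sketch of the preview rule above, assuming the 30-line and 4 KB limits apply to UTF-8 text and that both caps are enforced on the preview. The name `preview` and the constants are hypothetical; saving the full payload to the cache file is not shown.

```python
MAX_LINES = 30
MAX_BYTES = 4 * 1024


def preview(text):
    """Return (shown_text, was_truncated) for a text payload."""
    lines = text.splitlines()
    if len(lines) <= MAX_LINES and len(text.encode("utf-8")) <= MAX_BYTES:
        return text, False
    # Cap by lines first, then by bytes so single-line minified HTML is caught too.
    clipped = "\n".join(lines[:MAX_LINES]).encode("utf-8")[:MAX_BYTES]
    # errors="ignore" drops a multi-byte character cut in half by the byte cap.
    return clipped.decode("utf-8", errors="ignore"), True


short_text, short_cut = preview("a\nb")
minified, minified_cut = preview("x" * 10_000)
many_lines, many_cut = preview("\n".join(str(i) for i in range(100)))
```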
All 653 unit tests still pass.
Switches the REPL from prompt_toolkit's full_screen=False (inline) mode to full_screen=True with an in-memory ScrollbackBuffer. Eliminates the wrap-fragment / orphan-toolbar artifacts that bled into terminal scrollback on resize, and gives us full control over rendering for shimmer animations, mouse handling, and pagination.
Layout
- Pinned banner Window at the top: compact smblock "ScrapingBee" + version + tagline + ":help / :q" hint. Stays visible during long scrapes.
- Scrollback Window below the banner; spacer rows + horizontal separator between scrollback and the input area (Claude-CLI style).
- Toolbar at the bottom with paginated fields that rotate every 5s (Available Credits / Used Session / Concurrency / Next Update); the mode hint is pinned on every page so it's always visible.
- Running-state toolbar pins "running · Xs" on the left, rotates a stat in the middle (so credits consumed are visible during long crawls), and pins "Ctrl+C to stop" on the right.
Output handling
- ScrollbackBuffer + ScrollbackWriter pipe stdout / stderr / err_console through ANSI-parsing into an in-memory line list rendered by a scrollable Window. 10K-line ring buffer.
- Visual-row scroll (not logical-line): scroll_offset measured in terminal rows with width-aware line splitting, so long single-line output (huge JSON, etc.) scrolls one terminal row per wheel tick.
- Command echo splices into scrollback at the position where output started, on completion: no echo during execution (only the shimmer is the live indicator), echo appears right above output when done.
Input / interaction
- Mouse mode 1000 captures wheel/trackpad scroll; native drag-select still works because the terminal owns motion events. Tab toggles Scroll vs Select mode at runtime; toolbar hint shows current mode.
- Up/Down arrow keys navigate command history; explicit history.store_string() per submit since the custom Enter binding bypasses Buffer.validate_and_handle().
- Tab completion opens a popup via FloatContainer + CompletionsMenu (was silently entering completion state with no UI). Up/Down navigate, Enter picks, Esc dismisses.
- Pager (:view) wraps long lines, defaults to pretty-printed JSON with "r" to toggle raw, runs in a worker thread to avoid asyncio.run() conflict with the outer loop, re-enters alt buffer on exit so the outer REPL doesn't bleed into the main screen.
- Resize detection in the ticker triggers app.invalidate() so the layout adapts cleanly.
State / usage
- SessionState gains api_key_hash + per-session "used_credits_at_start" so re-auth with the same key preserves the session counter; a different key resets it.
- Background usage refresher polls /usage every 30s; "usage" command completion + auth completion trigger an immediate refresh via a thread-safe event.
- Banner shows "API key not set — type auth" when no key is configured.
- :help wrapping with a proper hanging indent (Text objects, not Rich markup, so leading whitespace is preserved); blank row between categories.
Crawl
- Skip Twisted signal-handler installation in REPL mode (signal.signal requires the main thread, but commands run in worker threads).
- Wire LOG_FILE to ~/.cache/scrapingbee-cli/crawl.log in REPL mode so the full crawl log is preserved beyond scrollback's MAX_LINES.
- Initialise usage_info to None before the batch-usage try block to prevent UnboundLocalError when the initial fetch raises.
Misc
- cli_utils: always overwrite the last-output cache for text responses in REPL mode (not just truncated ones) so :view never shows stale output from a previous command.
- Ctrl+C while running injects KeyboardInterrupt into the worker via PyThreadState_SetAsyncExc; surfaces as "stopped" in the footer.
- Reverted earlier experiments with Braille / PIL-rendered logos.
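The ring buffer and visual-row scrolling described under "Output handling" could look roughly like the sketch below. `ToyScrollback` is a hypothetical stand-in, not the PR's ScrollbackBuffer: it assumes one terminal cell per character and ignores ANSI sequences and wide glyphs.

```python
from collections import deque


class ToyScrollback:
    def __init__(self, max_lines=10_000):
        # deque(maxlen=...) silently drops the oldest line once full.
        self.lines = deque(maxlen=max_lines)

    def append(self, line):
        self.lines.append(line)

    def visual_rows(self, width):
        """Split each logical line into rows of at most `width` characters."""
        rows = []
        for line in self.lines:
            if not line:
                rows.append("")
                continue
            rows.extend(line[i:i + width] for i in range(0, len(line), width))
        return rows

    def window(self, width, height, scroll_offset):
        """Rows visible when scrolled `scroll_offset` rows up from the bottom."""
        rows = self.visual_rows(width)
        end = max(len(rows) - scroll_offset, 0)
        start = max(end - height, 0)
        return rows[start:end]


buf = ToyScrollback(max_lines=2)
for text in ("old", "abcdefgh", ""):
    buf.append(text)
```

Because the offset is counted in wrapped rows, one wheel tick moves a long single-line payload by one terminal row instead of jumping past the whole line.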
History navigation
- _submit now calls ``input_buffer.reset()`` instead of
``set_document(Document(""))`` so the history-navigation cursor
(``working_index``) is also reset. Without this, after submitting a
command the next Up press could continue browsing from wherever the
user had last left off in history.
- Up handler synchronously loads history strings into ``_working_lines``
when the buffer is fresh (len == 1). prompt_toolkit's
``load_history_if_not_yet_loaded`` schedules an *async* task that
doesn't run before the first keypress, so without this the first Up
after submit was a no-op and required two presses.
- Up handler also jumps ``working_index`` to the end when the buffer is
empty after browsing, so Up restarts from the newest entry rather than
walking further back from the previous browse position.
Esc latency
- Drop ``ttimeoutlen`` (parser-level escape-sequence wait, default 0.5s)
to 0.05s on both the main REPL Application and the :view pager
Application. Modern terminals deliver escape sequences as one read so
50ms is plenty.
- Drop ``timeoutlen`` (key-processor multi-key-binding wait, default
1.0s) to 0.05s on the pager — this was the main culprit behind the
2-3 second Esc delay there.
- Bind ``escape`` in the pager with ``eager=True`` so it fires the
moment the key processor sees it, bypassing partial-match search.
Both attributes are set on the Application instance after construction
because they aren't constructor parameters in this prompt_toolkit
version (passing them to __init__ raises TypeError).
…, fast Ctrl+C
- API key entry now lives inside the REPL UI: prompt flips to `API key:` with a masked input on startup or after `logout` / `auth`. No more pre-app getpass; no more `run_in_terminal` suspend/resume jolt.
- `!cmd` runs a shell command in a worker thread, gated by the existing unsafe-mode check. Output streams into scrollback; Ctrl+C terminates the child.
- Ctrl+C during a scrape stops in a frame instead of waiting for the HTTP request: tracks the worker's asyncio loop via a monkey-patched `asyncio.run` and cancels in-flight tasks via `call_soon_threadsafe`. CancelledError is caught alongside KeyboardInterrupt.
- Submitted command stays in the buffer if the run fails or is cancelled; only successful runs clear it.
- Batch progress: brand-yellow honeycomb hexes that fill as you go, with a shimmering boundary cell driven by the REPL's 10 Hz ticker. Single live-updating line via `replace_last_n_lines` instead of one appended row per completion. Usage credit meter mirrors the brand-yellow filled/outline palette.
- `:view` now also accepts `:view crawl` (alias for the crawl log) and `:view <path>` for arbitrary files. Meta-command echo is spliced ABOVE the meta's output, matching click-command echo order.
- History Up after submit no longer inverts oldest/newest order.
- `_validate_api_key` detects a running loop and offloads to a worker thread so REPL-mode `auth` no longer hits "asyncio.run cannot be called from a running event loop".
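The running-loop detection in the last item can be sketched as follows. This is an assumption about the general shape, not the PR's `_validate_api_key`: `_check_key` is a placeholder coroutine standing in for the real HTTP call, and `validate_api_key` is a hypothetical name.

```python
import asyncio
import concurrent.futures


async def _check_key(api_key):
    # Placeholder for the real network validation.
    await asyncio.sleep(0)
    return bool(api_key)


def validate_api_key(api_key):
    """Run the async check from sync code, whether or not a loop is running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: asyncio.run() is safe to call directly.
        return asyncio.run(_check_key(api_key))
    # A loop is already running here, so give asyncio.run() its own thread.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, _check_key(api_key)).result()
```

Note that `.result()` blocks the calling thread until the worker finishes, so this form only suits short checks.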
Replaces the compact smblock SCRAPING + stacked BEE block (10 logo rows total) with a single 6-row "SCRAPING BEE" wordmark in ANSI Shadow — yellow SCRAPING beside white BEE, mirroring the brand wordmark. Same letterforms as the legacy logo, just stitched onto one line of text instead of two so the banner takes less vertical space. SCRAPING rows are now padded to a uniform 62-column width so BEE starts at the same column on every row. Without the padding G's natural shape leaves a trailing space on rows 1, 2, 6 only — that shifted BEE one column right on rows 3, 4, 5 and the bottom of B / last E read as misaligned.
Best-effort XTERM Window Manipulation ("CSI 8 ; H ; W t") to bump the
window to 100 cols × 30 rows when the current size is below that. Fits
the 90-col banner with room for the toolbar + input. Only fires when
the window is actually too small, so users on a large terminal aren't
disrupted. Apple Terminal.app and SSH / tmux sessions ignore the
sequence and the REPL silently proceeds.
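A sketch of how the resize request might be built, assuming the 100 × 30 minimum above and that a dimension already larger than the minimum is kept as-is (that last point is an assumption). `resize_sequence` and `maybe_resize` are hypothetical names.

```python
import shutil
import sys

MIN_COLS, MIN_ROWS = 100, 30


def resize_sequence(cols, rows, min_cols=MIN_COLS, min_rows=MIN_ROWS):
    """Return the CSI 8 ; rows ; cols t sequence, or None if the window is big enough."""
    if cols >= min_cols and rows >= min_rows:
        return None
    return f"\x1b[8;{max(rows, min_rows)};{max(cols, min_cols)}t"


def maybe_resize(stream=sys.stdout):
    """Best effort: terminals that do not support the sequence ignore it."""
    size = shutil.get_terminal_size()
    sequence = resize_sequence(size.columns, size.lines)
    if sequence and stream.isatty():
        stream.write(sequence)
        stream.flush()
```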
Release covers the REPL overhaul series: in-place API key prompt, !shell exec, fast Ctrl+C cancellation, honeycomb batch progress with shimmering boundary cell, single-line SCRAPING BEE banner, terminal auto-resize at startup, and assorted polish.
…-flag consistency
Non-REPL changes (affect `scrapingbee crawl` and the CLI outside the REPL too):
crawl.py — screenshot crawl actually produces N files for --max-pages N
Pristine v1.4.1 with `--max-pages 5 --screenshot-full-page true` saved
only 1 PNG. Side-by-side test against pristine confirmed three stacked
upstream/spider bugs:
1. `_requires_discovery_phase` only checked `screenshot`, missing
`screenshot_full_page` and `screenshot_selector`. Those modes
silently fell into the same-mode `parse()` path that runs link
extraction on PNG bytes — yielding garbage URLs that crashed
on dispatch.
2. `scrapy_scrapingbee`'s default errback calls `response.text` on
binary 500 responses → `AttributeError` → killed the spider.
Every `ScrapingBeeRequest` is now wired to our `_on_request_error`
which logs the URL and continues.
3. The scheduler's LIFO ordering popped follow-discovery requests
before the save requests yielded alongside them. With ~100
follow URLs per page, saves were never dequeued before
`CLOSESPIDER_PAGECOUNT` bailed. Fix: `priority=10` on save
requests + raise `CloseSpider` from `_push_saved_status` when
`_save_count >= max_pages`, so the engine drops the rest of
the queue immediately.
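Bug 3 is easier to see in a toy model. The sketch below is not Scrapy: it assumes a scheduler that pops the highest priority first and, within one priority, the newest request first (LIFO), and it counts how many saves are handled before a response cap is hit. All names are hypothetical.

```python
import heapq
import itertools


def count_saves(max_responses, follows_per_page, save_priority):
    """Simulate a crawl where each page yields one save plus several follow requests."""
    seq = itertools.count()
    heap = []

    def push(kind, priority):
        # Negated values: higher priority pops first, then newest first (LIFO).
        heapq.heappush(heap, (-priority, -next(seq), kind))

    push("discover", 0)
    handled = saves = 0
    while heap and handled < max_responses:
        _, _, kind = heapq.heappop(heap)
        handled += 1
        if kind == "save":
            saves += 1
        else:
            push("save", save_priority)
            for _ in range(follows_per_page):
                push("discover", 0)
    return saves


print(count_saves(10, 3, save_priority=0))
print(count_saves(10, 3, save_priority=10))
```

At equal priority the newest follow requests always win, so no save is ever dequeued before the cap; raising the save priority interleaves one save per discovered page. The real fix also raises `CloseSpider` once enough pages are saved, which this model leaves out.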
crawl.py — pool-based discovery (binary / extract modes)
Old flow paid one HTML discovery + one save per saved page ≈ 2× credits.
New flow accumulates URLs into `_save_queue` while discovering; once
the pool reaches `max_pages` we flip `_discovery_done`, dispatch one
save per pooled URL in priority order, and stop discovering. For a
`--max-pages 100 --screenshot-full-page true` run on a link-rich site
that previously cost ~1000 credits, this is closer to ~510.
A `spider_idle` handler flushes the pool when the site is smaller
than the cap so small-site crawls still produce output.
crawl.py — `--max-pages N` now means N SAVED pages
Replaces the older `_fetch_count` cap with `_save_count` +
`_save_pending`. `--max-pages N` previously could stop early when
discovery requests counted against the cap; now it counts only
successful saves and matches the help text. Includes save-failure
backfill from the queue so flaky 5xx errors don't silently shrink
the user's effective budget.
crawl.py — `errback` + non-printable URL filter on every yielded request
`scrapy_scrapingbee`'s default errback is the binary-500 landmine
above. Our `_on_request_error` is now attached to every
`ScrapingBeeRequest` in the spider. Additionally, links whose
decoded path/query contains non-ASCII bytes (common when discovery
extracts hrefs from a corrupted PNG response on crawler-test.com
fixtures) are dropped at iteration time so they can't trip the
upstream errback in the first place.
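One possible shape for the URL filter, shown as a sketch. The heading mentions non-printable characters and the body mentions non-ASCII bytes, so this version assumes both checks apply; `is_clean_url` is a hypothetical name.

```python
from urllib.parse import unquote, urlsplit


def is_clean_url(url):
    """True when the decoded path and query are printable ASCII only."""
    parts = urlsplit(url)
    # unquote() maps undecodable bytes to U+FFFD, which is non-ASCII.
    decoded = unquote(parts.path) + unquote(parts.query)
    return decoded.isascii() and decoded.isprintable()


links = [
    "https://example.com/blog?page=2",
    "https://example.com/%89PNG%0D%0A",
    "https://example.com/a%00b",
]
kept = [link for link in links if is_clean_url(link)]
```

A filter this strict also drops legitimate URLs with non-ASCII paths, which is a trade-off to weigh against avoiding garbage links.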
commands/crawl.py — concurrency-warning shows the actual reason
Was: `Warning: could not check plan concurrency. Defaulting to 1…`
Now: `Warning: could not check plan concurrency (HTTP 429). Defaulting…`
The `/usage` endpoint is rate-limited; without the reason, users
couldn't distinguish a transient 429 from a real auth/network
problem and would default to concurrency=1 unnecessarily.
cli.py + cli_utils.py — `--flag true|false` accepted for every bool flag
Scraping-side options (`--render-js true`, `--premium-proxy true`, …)
already took explicit `true`/`false` while Click flags (`--verbose`,
`--resume`, `--escalate-proxy`, etc.) were bare-only — inconsistent
UX. An argv preprocessor (`normalize_bool_flag_args`) collects all
`is_flag=True` option names from the click tree and rewrites
`--verbose true` → `--verbose`, `--verbose false` → (dropped, default
applies). Bare `--verbose` still works. Applied at both `cli.main()`
entry and REPL dispatch so behaviour matches everywhere.
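The rewrite rules above can be sketched as a pure function. `rewrite_bool_flags` is a hypothetical stand-in for the preprocessor; it takes the set of bare-flag names as an argument instead of collecting them from the click tree.

```python
def rewrite_bool_flags(argv, flag_names):
    """Rewrite '--flag true' to '--flag' and drop '--flag false' for bare flags."""
    out = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        nxt = argv[i + 1].lower() if i + 1 < len(argv) else None
        if arg in flag_names and nxt in ("true", "false"):
            if nxt == "true":
                out.append(arg)  # keep the bare flag
            # 'false' drops the flag entirely so the default applies
            i += 2
            continue
        out.append(arg)
        i += 1
    return out


FLAGS = {"--verbose", "--resume"}  # assumed subset of is_flag options
print(rewrite_bool_flags(["scrape", "--verbose", "true", "https://example.com"], FLAGS))
```

Options that already take a value, such as `--render-js true`, are not in the flag set and pass through unchanged.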
REPL (new interactive mode — large but mostly self-contained):
interactive.py / theme.py / batch.py / commands/*.py
- Full-screen alt-buffer with banner, fixed status widget, virtual
scrollback, bottom toolbar with live credit honeycomb.
- Subprocess-per-crawl: Twisted's reactor is a process singleton and
can't be reused; running crawls in a child process lets the REPL
handle multiple consecutive crawls per session.
- Crawl + batch share a unified fixed widget that shows banner-compact
+ honeycomb progress + URL line (crawl only). No more honeycomb
rows leaking into scrollback.
- Click-to-open paths: existing paths in scrollback are underlined
brand-yellow; click opens in Finder / xdg-open / os.startfile.
Detection handles paths with spaces, `:line:col` suffixes, and
rejects URL `://path` false positives.
- `:view` pager pretty-prints JSON (existing) and HTML (new via lxml);
`r` toggles raw.
- Multi-line paste preview: bracketed paste with newlines puts the
pasted lines in a multi-line editable buffer (Up/Down navigate
lines, Ctrl+J / Alt+Enter insert newline). Enter submits all,
queueing rest via `_pending_commands`. Esc / Ctrl+C clear.
- Tab completion: single-match inline-completes (bash-style), multi-
match opens popup, ghost-text-word fallback when nothing to
complete. Right accepts the next word of the ghost suggestion;
End accepts the whole ghost suggestion.
- Ctrl+C escalation: first press sends SIGTERM (graceful), second
within 2 s sends SIGKILL — useful when Twisted is parked in a
long screenshot fetch and SIGTERM lags.
- Ctrl+R / Ctrl+S explicitly disabled (their default reverse-i-search
writes into a hidden buffer we don't render — typing went to a
black hole).
- `auth --unsafe` intercepted in REPL with a "run outside" message;
its multi-step disclaimer + masked-getpass fights our termios.
- Bee facts list audited (9 corrected — Einstein quote, honey-as-
sustenance myth, etc.) and rotation starts with a verb so quick
commands don't flash trivia.
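The Ctrl+C escalation item can be reduced to a small decision helper, sketched below for POSIX systems (where `signal.SIGKILL` exists). `InterruptEscalator` is a hypothetical name, and treating exactly 2 s as still "within" the window is an assumption.

```python
import signal
import time


class InterruptEscalator:
    """Pick SIGTERM for a first press and SIGKILL for a quick second press."""

    def __init__(self, window=2.0, clock=time.monotonic):
        self.window = window
        self.clock = clock
        self._last_press = None

    def next_signal(self):
        now = self.clock()
        escalate = (
            self._last_press is not None
            and now - self._last_press <= self.window
        )
        # After escalating, start over so a later press is graceful again.
        self._last_press = None if escalate else now
        return signal.SIGKILL if escalate else signal.SIGTERM


fake_times = iter([0.0, 1.0, 10.0, 13.0])
escalator = InterruptEscalator(clock=lambda: next(fake_times))
sequence = [escalator.next_signal() for _ in range(4)]
```

In a subprocess-per-crawl setup, the result could be passed to `Popen.send_signal()` on the child process.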
commands/amazon.py, chatgpt.py, fast_search.py, google.py, scrape.py,
usage.py, walmart.py, youtube.py — no net behavioural change vs main.
The `LiveCreditTracker` / `MiniBeeSpinner` wrappers that were added
earlier in the REPL branch have been removed (they were dead code),
leaving only REPL-gated paths (`if is_repl_mode()` branches).
- Rename _CrawlerReactorAlreadyUsed → _CrawlerReactorAlreadyUsedError (N818)
- Drop CamelCase import aliases and lowercase in-function constants flagged by N806/N813/N814 across interactive.py, cli_utils.py, theme.py
- Route dynamic attribute access through getattr/setattr for twisted's reactor (callFromThread/stop), scrapy Spider._crawler, sys.stdout/err .buffer adapter install, and rich Console.file rebind so ty stops flagging unresolved-attribute / invalid-assignment
- Import lxml.etree / lxml.html via importlib so ty resolves the compiled submodules
- Pass loop_factory to asyncio.run via **kwargs (3.12+ signature) and install the wrapper via setattr to satisfy ty
- Guard sys.__stdout__ None case and tighten _set_text null-check
- Remove unused type:ignore comments and the now-unused shutil.get_terminal_size assignment in interactive.py
- Delete stale tests in test_crawl.py that referenced helpers removed by the pool-based screenshot crawl rewrite (_parse_discovery_links_only, _NON_HTML_URL_EXTENSIONS)
CI runs `ruff format --check src tests` in addition to `ruff check`; 8 files were flagged. Apply ruff format so the Lint job passes.
confirm_overwrite() called click.confirm() when the target file existed, which reads from sys.stdin directly. In REPL mode prompt_toolkit owns the TTY (full-screen / alt-buffer) and never forwards keystrokes to stdin, so the prompt blocked forever and the REPL appeared frozen. When is_repl_mode() is true, raise a UsageError telling the user to re-run with --overwrite instead of attempting to prompt.
Summary
- Interactive REPL (`scrapingbee` with no subcommand): full-screen UI with banner, fixed live status widget, virtual scrollback, click-to-open paths, multi-line paste preview, and Ctrl+C-safe command interruption.
- `--max-pages N --screenshot-full-page true` saved only 1 PNG in v1.4.1 (verified against pristine). Now produces exactly N files. Three stacked bugs fixed: `_requires_discovery_phase` was missing `screenshot_full_page` / `screenshot_selector`; `scrapy_scrapingbee`'s default errback crashed on binary 500 responses; the scheduler's LIFO ordering pushed save requests behind a growing pile of follow-discoveries.
- Pool-based discovery: instead of one `HTML discovery + binary save` per page (~2× credits), we discover until the pool has ≥ `max_pages` URLs and then batch-dispatch all saves. ~50% credit reduction on link-rich sites at `--max-pages 100`.
- `--max-pages N` now means N SAVED pages (was N total responses). Backfill on save failure so flaky 5xx errors don't silently shrink the user's budget.
- `--flag true/false` syntax accepted across every boolean flag for consistency with the scraping-side options (`--render-js true`, …). Bare `--verbose` still works. Applied at both CLI entry and REPL dispatch via an argv preprocessor.
- `commands/crawl.py` concurrency warning now shows the underlying error (e.g. `HTTP 429`) so users can tell rate-limit hiccups from auth/network problems.

CLI behaviour outside the REPL is otherwise preserved; see the most recent commit message for a hunk-by-hunk explanation of each non-REPL change.
Test plan
- `--help` exits 0 for all 19 subcommands
- `usage`, `docs`, `auth --show`, `unsafe --list`, `schedule --list`
- `scrape`, `fast-search`, `google`, `chatgpt`, `amazon-product`, `amazon-search`, `walmart-product`, `walmart-search`, `youtube-search`, `youtube-metadata`
- `scrape --input-file urls.txt --output-dir batch/`
- `--max-pages 3` produces exactly 3 files
- `--max-pages 3 --screenshot-full-page true` produces exactly 3 PNGs (vs pristine v1.4.1 which produces 1)
- `--save-pattern "blog"` filters correctly
- `$()`/backtick/pipe/`&&` injection blocking, audit log captures actions, `unsafe --disable` re-locks the gate
- `--verbose true`, `--verbose false`, bare `--verbose` all work; same for other bool flags
- Meta-commands (`:help`/`:set`/`:show`/`:list`/`:view`), Tab completion (single-match inline + multi-match popup + ghost-text-word fallback), shell `!cmd`, single API call with preview, crawl status widget (banner shrinks, URL line, honeycomb), screenshot crawl bug fix verification (exactly N PNGs), multiple consecutive crawls in one session (subprocess-per-crawl), Ctrl+C SIGTERM→SIGKILL escalation, batch progress widget unified with crawl, `:view` smart routing + HTML pretty-print, unsafe-mode `--on-complete` hook, multi-line paste preview + edit-before-execute, click-to-open paths, exit/restart history persistence