RQ-Radionics/samcraw

samcraw

A recursive web crawler that downloads files from a website, starting at a given URL path.

Features

  • Recursive crawling with configurable depth limit
  • Concurrent downloads
  • Never follows links outside the starting domain
  • Never ascends above the starting URL path
  • Skips files that already exist locally with the same size (via HEAD + Content-Length)
  • Configurable delay between requests, User-Agent, and concurrency

Supported file types

.zip .tar.gz .tar.bz2 .tar.xz .tgz .rar .7z .pdf .tap .z80 .bin .dsk
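Matching this list needs suffix checks rather than Go's filepath.Ext, because .tar.gz and friends span two dots. A minimal sketch, assuming a case-insensitive match (the function name is illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// Supported extensions, multi-dot suffixes first.
var exts = []string{
	".tar.gz", ".tar.bz2", ".tar.xz",
	".zip", ".tgz", ".rar", ".7z", ".pdf",
	".tap", ".z80", ".bin", ".dsk",
}

// wantFile reports whether a URL path ends in a supported extension.
func wantFile(path string) bool {
	lower := strings.ToLower(path)
	for _, e := range exts {
		if strings.HasSuffix(lower, e) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(wantFile("/roms/game.TAP")) // true: match is case-insensitive
	fmt.Println(wantFile("/dump.tar.gz"))   // true: multi-dot suffix
	fmt.Println(wantFile("/index.html"))    // false: not a download target
}
```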

Requirements

  • Go 1.26+
  • golangci-lint (for linting only)

Build

make build

The binary is placed in build/samcraw.

Usage

samcraw -url <URL> [options]

Options

Flag          Default          Description
-url          (required)       Starting URL to crawl
-output       ./downloads      Output directory for downloaded files
-depth        10               Maximum crawl recursion depth
-concurrency  3                Number of simultaneous downloads
-delay        500ms            Delay between HTTP requests
-user-agent   samcraw/1.0.0    User-Agent header for HTTP requests
-version      (none)           Show version and exit

Examples

# Basic usage
samcraw -url https://example.com/files/

# Custom output directory and higher concurrency
samcraw -url https://example.com/files/ -output ./my-files -concurrency 5

# Limit depth and add delay
samcraw -url https://example.com/files/ -depth 3 -delay 1s

Development

# Run tests
make test

# Format code
make format

# Lint (staticcheck, govulncheck, golangci-lint)
make lint

# Build and run with example options
make run

License

MIT
