diff --git a/plan.md b/plan.md
new file mode 100644
index 0000000..b183866
--- /dev/null
+++ b/plan.md
@@ -0,0 +1,101 @@
+# Implementation Plan for Collect CLI Tool
+
+## Overview
+A CLI tool that collects files recursively matching specific criteria, maintains their file structure, and archives them for backup purposes.
+
+## Language Choice: Go
+- Single binary distribution
+- Built-in archive support (tar/gzip and zip)
+- Excellent file system traversal capabilities
+- Cross-platform compatibility
+- No runtime dependencies
+
+## Behavior Specifications
+
+### CLI Interface
+```
+collect [--name | --match ]
+```
+
+### Flags
+- `--name`: Match exact filename (e.g., `--name .mise.toml`)
+- `--match`: Match directory pattern, collect all files within (e.g., `--match aet-*/`)
+- Flags are mutually exclusive
+
+### File Collection
+- **Name matching**: Find all files with exact filename match anywhere in tree
+- **Pattern matching**: Find directories matching glob pattern, then collect all files recursively within those directories
+- Symlinks are ignored
+- Preserve relative paths from source directory in archive
+
+### Archive Format
+- Determined by output file extension
+- Supported: `.tar.gz`, `.tgz`, `.zip`
+
+### Error Handling
+- Permission errors: Log warning to stderr, skip file, continue
+- No files found: Exit with code 1 and error message
+- Archive creation failure: Exit with code 2
+- Invalid arguments: Exit with code 3
+- All errors reported to stderr
+
+## Architecture
+
+### Package Structure
+```
+collect/
+├── main.go              # CLI entry point and argument parsing
+├── collector/
+│   ├── collector.go     # Core collection logic
+│   └── matcher.go       # File/directory matching logic
+├── archiver/
+│   ├── archiver.go      # Archive interface
+│   ├── tar.go           # Tar/gzip implementation
+│   └── zip.go           # Zip implementation
+└── go.mod               # Go module definition
+```
+
+### Core Interfaces
+
+#### Collector
+```go
+type Collector interface {
+    Collect(sourceDir string) ([]FileEntry, error)
+}
+
+type FileEntry struct {
+    Path     string // Relative path from sourceDir
+    FullPath string // Absolute path for reading
+}
+```
+
+#### Matcher
+```go
+type Matcher interface {
+    ShouldInclude(path string, info os.FileInfo) bool
+}
+```
+
+#### Archiver
+```go
+type Archiver interface {
+    Create(outputPath string, files []FileEntry) error
+}
+```
+
+### Implementation Flow
+1. Parse and validate CLI arguments
+2. Create appropriate matcher (NameMatcher or PatternMatcher)
+3. Create collector with matcher
+4. Walk directory tree and collect matching files
+5. Check if any files were found (error if empty)
+6. Create appropriate archiver based on file extension
+7. Build archive with collected files
+8. Report any errors to stderr
+
+### Key Decisions
+- Use `filepath.WalkDir` for efficient directory traversal
+- Clean all paths to ensure consistent relative paths
+- Use buffered I/O for archive operations
+- Pattern matching uses Go's `filepath.Match` function
+- Archive paths are relative to the search directory root
\ No newline at end of file
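
For the "Archive Format" rule, one detail worth noting is that `filepath.Ext("backup.tar.gz")` returns `.gz`, so a suffix check is a simpler fit than an extension lookup. The sketch below illustrates that; `archiveFormat` is a hypothetical helper, and the case-insensitive comparison is an assumption the plan does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// archiveFormat maps an output path to one of the formats listed in the plan.
// Hypothetical helper: the name, the returned strings, and the
// case-insensitive comparison are assumptions.
func archiveFormat(outputPath string) (string, error) {
	lower := strings.ToLower(outputPath)
	switch {
	case strings.HasSuffix(lower, ".tar.gz"), strings.HasSuffix(lower, ".tgz"):
		return "tar.gz", nil
	case strings.HasSuffix(lower, ".zip"):
		return "zip", nil
	}
	return "", fmt.Errorf("unsupported archive extension: %s", outputPath)
}

func main() {
	for _, p := range []string{"backup.tar.gz", "backup.tgz", "backup.zip", "backup.rar"} {
		format, err := archiveFormat(p)
		if err != nil {
			fmt.Println(p, "->", err)
			continue
		}
		fmt.Println(p, "->", format)
	}
}
```

An unsupported extension would most naturally map to the plan's "Invalid arguments" exit code (3), though the plan does not say so explicitly.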