Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Add lakectl fs cp and lakectl fs mv commands#10083

Open
nside wants to merge 4 commits intotreeverse:masterfrom
nside:feature/lakectl-fs-cp-mv
Open

Add lakectl fs cp and lakectl fs mv commands#10083
nside wants to merge 4 commits intotreeverse:masterfrom
nside:feature/lakectl-fs-cp-mv

Conversation

@nside
Copy link

@nside nside commented Feb 2, 2026

Summary

  • Add lakectl fs cp command to copy objects within a repository
  • Add lakectl fs mv command to move objects within a repository (copy + delete)
  • Both commands support single-object and recursive operations

Test plan

  • Unit tests added for copy/move helper functions
  • Integration tests added in esti for both commands
  • Manual testing on live lakeFS instance
  • Run full CI test suite

Description

This PR adds two frequently requested commands to lakectl for managing objects within a repository:

lakectl fs cp <source URI> <dest URI>

Copies objects from source to destination within the same repository.

lakectl fs mv <source URI> <dest URI>

Moves objects (copy + delete source) within the same repository.

Flags (both commands)

Flag Description
-r, --recursive Recursively process all objects under the path
-p, --parallelism Max concurrent operations (default: 25)
--no-progress Disable progress bar

Example usage

# Copy single file
lakectl fs cp lakefs://repo/main/file.txt lakefs://repo/main/copy.txt

# Copy directory recursively
lakectl fs cp -r lakefs://repo/main/data/ lakefs://repo/main/data-backup/

# Move file
lakectl fs mv lakefs://repo/main/old.txt lakefs://repo/main/new.txt

Implementation notes

  • Uses existing CopyObject API for copies
  • Recursive operations use shared worker pool pattern (same as fs rm)
  • Move performs copy first, then batch deletes sources (chunks of 1000)
  • Both URIs must be in the same repository (API limitation)
  • Error output limited to 10 messages to avoid flooding stderr

🤖 Generated with Claude Code

@CLAassistant
Copy link

CLAassistant commented Feb 2, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added area/testing Improvements or additions to tests area/lakectl Issues related to lakeFS' command line interface (lakectl) labels Feb 2, 2026
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ae8cc65867

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@nside nside force-pushed the feature/lakectl-fs-cp-mv branch from ae8cc65 to ea0cb71 Compare February 2, 2026 00:51
Add two new commands to the lakectl CLI:
- `lakectl fs cp` - Copy objects within a repository
- `lakectl fs mv` - Move objects within a repository (copy + delete)

Both commands support:
- Single object and recursive (-r) operations
- Force flag (-f) to overwrite existing objects
- Configurable parallelism (-p) for concurrent operations
- Progress bar (disable with --no-progress)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@nside nside force-pushed the feature/lakectl-fs-cp-mv branch from ea0cb71 to 35eaf09 Compare February 2, 2026 00:53
Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Given the number of rough edges around the copyObject API I Will leave the overall CLI design to others who are more expert.

Right now I'm concerned about:

  • Some very strange possible behaviours.
    • (Error flooding when something goes wrong...).
    • -f flag seems to do something other than documented.
  • This seems heavily AI-generated. The sheer number of lines of code will raise the maintenance burden. I would like to request more oversight; among things:
    • Unify significant part of copy and move behaviours.
    • Use atomics where necessary, avoid where unnecessary.
    • Readability of the tests - again, they are very hard to read. It took me a while to understand where to find tests for "mv". I still don't understand a method called mockCopyClient.DeleteObjectsWithResponse. Etc.

Comment on lines +134 to +135
errorsWg.Add(1)
go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use errorsWg.Go. Even better use errgroup.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - switched to errorsWg.Go(func() {...}) pattern, matching the style in fs_rm.go.

go func() {
defer errorsWg.Done()
for err := range errorCh {
fmt.Fprintln(os.Stderr, "Error:", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can flood the screen, and it is poorly formatted.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - now limits error output to 10 messages, then shows "(additional errors suppressed)".

defer errorsWg.Done()
for err := range errorCh {
fmt.Fprintln(os.Stderr, "Error:", err)
atomic.AddInt64(&errors, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this atomic? It will only be accessed after WaitGroup.Wait surely?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the atomic - you're right, errorCount is only modified in the single error handler goroutine and read after Wait().

{{ if .Errors }}Errors: {{ .Errors }} object(s){{ end }}
`

var fsMvCmd = &cobra.Command{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command is implement quite similarly to the "copy" command above. Can we share code between them?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - unified into a single recursiveCopyMove() function with a deleteSource parameter. Reduced ~170 lines of duplication.

func init() {
withRecursiveFlag(fsCpCmd, "recursively copy all objects under the specified path")
withParallelismFlag(fsCpCmd)
fsCpCmd.Flags().BoolP("force", "f", false, "overwrite existing objects at destination")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that this is correct? AFAICS this flag ends up on the option graveler.SetOptions.Force, which has this godoc:

		// Force set to true will bypass repository read-only protection.
		Force bool

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - the Force flag bypasses read-only protection, not "overwrite existing". Removed the -f/--force flag entirely since it doesn't match user expectations.

nside and others added 2 commits February 18, 2026 21:55
- Remove misleading -f/--force flag (was bypassing read-only protection,
  not overwriting existing objects as documented)
- Unify copy and move logic into shared recursiveCopyMove function
- Use sync.WaitGroup.Go() pattern instead of Add(1) + go func()
- Remove unnecessary atomic operations (errorCount accessed after Wait())
- Limit error output to 10 messages to avoid flooding stderr

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@github-actions github-actions bot added the docs Improvements or additions to documentation label Feb 19, 2026
Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I find your continued use of AI in this PR perplexing: it appears that Claude has done little to scan the existing code, and it adds huge portions of new code (e.g. a new batched deleter) while failing to address the flagged issues (e.g. using sync.WaitGroup.Go). Please manually go over the comments and over the generated code. While we obviously appreciate code contributions, as a maintainer I need to ensure that accepted code is maintainable. This is particularly important in an open-source project.

Comment on lines +100 to +101
// deleteObjectsBatch deletes objects in batches of deleteChunkSize
func deleteObjectsBatch(ctx context.Context, client apigen.ClientWithResponsesInterface, repository, branch string, paths []string) []error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have deleteObjectWorker in fs_rm. It even supports concurrent operations. Why add another way to do it?

}

// recursiveCopyMove handles recursive copy or move operations.
// When deleteSource is true, source objects are deleted after successful copy (move behavior).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why not call it move?

var wg sync.WaitGroup
wg.Add(parallelism)
for range parallelism {
go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, use wg.Go.

Comment on lines +184 to +186
copiedMu.Lock()
copiedPaths = append(copiedPaths, task.srcPath)
copiedMu.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking, but curious why this uses a lock rather than a channel.

Comment on lines +208 to +211
// Skip directory markers
if strings.HasSuffix(obj.Path, uri.PathSeparator) {
continue
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand:

  1. Looking at the suffix of the path means this does not copy directory markers - empty objects which are named for the "directory", which some systems (notably Spark) seem to like.
  2. Why not look a obj.PathType?
  3. Is this a recursive copy or not? If recursive, why are you even getting these bad objects?

continue
}
// Transform path: replace source prefix with dest prefix
relPath := strings.TrimPrefix(obj.Path, srcPrefix)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use CutPrefix and check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/lakectl Issues related to lakeFS' command line interface (lakectl) area/testing Improvements or additions to tests docs Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants