-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cat-file: add %(objectmode)
avoid verifying submodules' OIDs
#1689
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -292,6 +292,11 @@ newline. The available atoms are: | |
`objecttype`:: | ||
The type of the object (the same as `cat-file -t` reports). | ||
|
||
`objectmode`:: | ||
If the specified object has mode information (such as a tree or | ||
index entry), the mode expressed as an octal integer. Otherwise, | ||
empty string. | ||
|
||
`objectsize`:: | ||
The size, in bytes, of the object (the same as `cat-file -s` | ||
reports). | ||
|
@@ -407,6 +412,11 @@ Note also that multiple copies of an object may be present in the object | |
database; in this case, it is undefined which copy's size or delta base | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Jeff King wrote (reply to this): On Mon, Mar 11, 2024 at 06:56:02PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index de29e6d79d9..69b50d2042f 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -412,6 +412,11 @@ Note also that multiple copies of an object may be present in the object
> database; in this case, it is undefined which copy's size or delta base
> will be reported.
>
> +Submodules are handled specially in `git cat-file`, as the objects
> +corresponding to the recorded OIDs are not expected to be present in the
> +current repository. For that reason, submodules are reported as having
> +type `submodule` and mode 1600000 and all other fields are zeroed out.
I think there's an extra 0 in the mode here?
It may also be worth being more explicit about when Git knows something
is a submodule. Naively, reading the above I might think that:
git ls-tree --format='%(objectname)' HEAD | git cat-file --batch-check
would do something special with submodules. But it can't, as there's no
context carried in just the objectname. This is obvious if you are
familiar with how Git works, but I'm not sure it would be for all end
users. So we could say something along the lines of:
When `cat-file` is given a name within a tree that points to a
submodule (e.g., `HEAD:my-submodule`), ...
-Peff There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): "Johannes Schindelin via GitGitGadget" <[email protected]>
writes:
> +Submodules are handled specially in `git cat-file`, as the objects
> +corresponding to the recorded OIDs are not expected to be present in the
> +current repository. For that reason, submodules are reported as having
> +type `submodule` and mode 1600000 and all other fields are zeroed out.
While the above may not be technically wrong per-se, I am not sure
if that is the more important part of what we want to tell our
users. For example, "git ls-tree HEAD -- sha1collisiondetection"
reports "160000 commit ...object.name.... sha1collisiondetection".
Is it correct to say ...
Submodules are handled specially in `git ls-tree`, as the
objects corresponding to the recorded OIDs are not expected to
be present in the current repository.
...? I do not think so.
For the same reason, as an explanation for the reason why "git
cat-file -t :sha1collisiondetection" just reports "submodule", the
new text does not sit well.
I actually have to wonder if the new behaviour proposed by this
patch is a solution that is in search of a problem, or trying to
solve an unstated problem in a wrong way.
O=$(git rev-parse --verify :sha1collisiondetection)
git cat-file -t "$O"
should fail because the object whose name is $O is not available.
Why should then this succeed and give a different result?
git cat-file -t :sha1collisiondetection
The "cat-file" command is about objects. While object's type may
sometimes be inferrable (by being contained in a tree), if the user
asks us to determine the type of the object, we should actually hit
the object store, whether the commit object in question happens to
be on our history or somebody else's history that our gitlink points
at.
So, I am not yet convinced that I should take this patch. Previous
two steps looked good, though.
Thanks.
> index 73bd78c0b63..c59ad682d1f 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -128,7 +128,9 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
> switch (opt) {
> case 't':
> oi.type_name = &sb;
> - if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
> + if (obj_context.mode == S_IFGITLINK)
> + strbuf_addstr(&sb, "submodule");
> + else if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
> die("git cat-file: could not get object info"); There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Jeff King wrote (reply to this): On Tue, Mar 12, 2024 at 11:35:16AM -0700, Junio C Hamano wrote:
> I actually have to wonder if the new behaviour proposed by this
> patch is a solution that is in search of a problem, or trying to
> solve an unstated problem in a wrong way.
>
> O=$(git rev-parse --verify :sha1collisiondetection)
> git cat-file -t "$O"
>
> should fail because the object whose name is $O is not available.
> Why should then this succeed and give a different result?
>
> git cat-file -t :sha1collisiondetection
>
> The "cat-file" command is about objects. While object's type may
> sometimes be inferrable (by being contained in a tree), if the user
> asks us to determine the type of the object, we should actually hit
> the object store, whether the commit object in question happens to
> be on our history or somebody else's history that our gitlink points
> at.
>
> So, I am not yet convinced that I should take this patch. Previous
> two steps looked good, though.
I'm not sure about "-t" in particular, but for batch output, I think if
we stop at patch 2 it would be impossible to tell the difference between
a submodule entry and a corrupt repo (or bad request). E.g., if I do
this:
(echo HEAD:Makefile; echo HEAD:sha1collisiondetection) |
git cat-file --batch-check='%(objectname) %(objectmode)'
after only patch 2, I'd get:
4e255c81f22386389c7460d8f5e59426673b5a5a 100644
HEAD:sha1collisiondetection missing
We can't tell if HEAD didn't resolve, or it doesn't have that path, or
if it's a regular blob entry and the repository is corrupt. Whereas
after patch 3, we get:
4e255c81f22386389c7460d8f5e59426673b5a5a 100644
855827c583bc30645ba427885caa40c5b81764d2 160000
and the mode tells us that we resolved it to a submodule.
The current behavior is not too surprising for cat-file, since it's
whole purpose is to give you information about the objects themselves,
and we don't have one here. But with this %(objectmode) format, we're
really moving into a realm of "resolve this name for me and show me the
context". We don't care about the details of the object at all!
I think you could make an argument that the problem is shoe-horning new,
slightly-mismatched functionality into cat-file. But there are lots of
practical reasons to want to do so, as we discussed elsewhere. Since
gitlinks are the only place where we'd expect an object to be missing,
"simulating" them here isn't too bad. But I suspect there's a more
general solution where cat-file learns to print dummy values for any
missing object, letting the caller see what we _could_ find out. And
then the submodule case just falls out naturally. I doubt we could make
it the default for historical compatibility; we'd need a new option.
This is all speculative on my part, of course. Probably Dscho or
Victoria can explain their use case better. :)
-Peff There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): Jeff King <[email protected]> writes:
> I think you could make an argument that the problem is
> shoe-horning new, slightly-mismatched functionality into
> cat-file. But there are lots of practical reasons to want to do
> so, as we discussed elsewhere. Since gitlinks are the only place
> where we'd expect an object to be missing, "simulating" them here
> isn't too bad.
100% agreed. This is something we should be asking about "HEAD:"
tree object, not about "HEAD:sha1collisiondetection" object, if we
are to ask cat-file. After all "cat-file p HEAD:" tells us that the
thing is a submodule already. But unfortunately the "--batch" thing
is limited to "give me an object and what you want to know about the
object, and I'll tell you what I know about it" exchange, so it is a
very bad match when you cannot really give it an object (which you
do not have, like the target of the gitlink). So...
> But I suspect there's a more
> general solution where cat-file learns to print dummy values for any
> missing object, letting the caller see what we _could_ find out. And
> then the submodule case just falls out naturally. I doubt we could make
> it the default for historical compatibility; we'd need a new option.
... "--batch" obviously needs to be extended, and %(objectmode) may
be one direction to do so, but it would also work to allow us to ask
about "HEAD:" and what it has at paths, which match a pathspec
"sha1collisiondetection", an equivalent to give "cat-file --batch" a
command to drive "ls-tree".
> This is all speculative on my part, of course. Probably Dscho or
> Victoria can explain their use case better. :)
Likewise. |
||
will be reported. | ||
|
||
Submodules are handled specially in `git cat-file`, as the objects | ||
corresponding to the recorded OIDs are not expected to be present in the | ||
current repository. For that reason, submodules are reported as having | ||
type `submodule` and mode 1600000 and all other fields are zeroed out. | ||
|
||
GIT | ||
--- | ||
Part of the linkgit:git[1] suite |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -112,65 +112,67 @@ strlen () { | |
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): "Victoria Dye via GitGitGadget" <[email protected]> writes:
> From: Victoria Dye <[email protected]>
>
> Update the 'run_tests' test wrapper so that the first argument may refer to
> any specifier that uniquely identifies an object (e.g. a ref name,
> '<OID>:<path>', '<OID>^{<type>}', etc.), rather than only a full object ID.
> Also, add a test that uses a non-OID identifier, ensuring appropriate
> parsing in 'cat-file'.
>
> Signed-off-by: Victoria Dye <[email protected]>
> Signed-off-by: Johannes Schindelin <[email protected]>
> ---
> t/t1006-cat-file.sh | 46 +++++++++++++++++++++++----------------------
> 1 file changed, 24 insertions(+), 22 deletions(-)
>
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index e0c6482797e..ac1f754ee32 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -112,65 +112,66 @@ strlen () {
>
> run_tests () {
> type=$1
> - sha1=$2
> + object_name=$2
> + oid=$(git rev-parse --verify $object_name)
> size=$3
> content=$4
> pretty_content=$5
>
> - batch_output="$sha1 $type $size
> + batch_output="$oid $type $size
> $content"
As "object_name" is now allowed to be any name in the 'extended
SHA-1' syntax (cf. Documentation/revisions.txt), you should be a bit
more careful in quoting.
oid=$(git rev-parse --verify "$object_name")
> test_expect_success "$type exists" '
> - git cat-file -e $sha1
> + git cat-file -e $object_name
> '
Likewise. You may not currently use a path with SP in it to name a
tree object, e.g., "HEAD:Read Me.txt", but protecting against such a
pathname is a cheap investment for futureproofing.
Looking good otherwise. Thanks. |
||
run_tests () { | ||
type=$1 | ||
sha1=$2 | ||
size=$3 | ||
content=$4 | ||
pretty_content=$5 | ||
|
||
batch_output="$sha1 $type $size | ||
object_name=$2 | ||
oid=$(git rev-parse --verify $object_name) | ||
mode=$3 | ||
size=$4 | ||
content=$5 | ||
pretty_content=$6 | ||
|
||
batch_output="$oid $type $size | ||
$content" | ||
|
||
test_expect_success "$type exists" ' | ||
git cat-file -e $sha1 | ||
git cat-file -e $object_name | ||
' | ||
|
||
test_expect_success "Type of $type is correct" ' | ||
echo $type >expect && | ||
git cat-file -t $sha1 >actual && | ||
git cat-file -t $object_name >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success "Size of $type is correct" ' | ||
echo $size >expect && | ||
git cat-file -s $sha1 >actual && | ||
git cat-file -s $object_name >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success "Type of $type is correct using --allow-unknown-type" ' | ||
echo $type >expect && | ||
git cat-file -t --allow-unknown-type $sha1 >actual && | ||
git cat-file -t --allow-unknown-type $object_name >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success "Size of $type is correct using --allow-unknown-type" ' | ||
echo $size >expect && | ||
git cat-file -s --allow-unknown-type $sha1 >actual && | ||
git cat-file -s --allow-unknown-type $object_name >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test -z "$content" || | ||
test_expect_success "Content of $type is correct" ' | ||
echo_without_newline "$content" >expect && | ||
git cat-file $type $sha1 >actual && | ||
git cat-file $type $object_name >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success "Pretty content of $type is correct" ' | ||
echo_without_newline "$pretty_content" >expect && | ||
git cat-file -p $sha1 >actual && | ||
git cat-file -p $object_name >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test -z "$content" || | ||
test_expect_success "--batch output of $type is correct" ' | ||
echo "$batch_output" >expect && | ||
echo $sha1 | git cat-file --batch >actual && | ||
echo $object_name | git cat-file --batch >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success "--batch-check output of $type is correct" ' | ||
echo "$sha1 $type $size" >expect && | ||
echo_without_newline $sha1 | git cat-file --batch-check >actual && | ||
echo "$oid $type $size" >expect && | ||
echo_without_newline $object_name | git cat-file --batch-check >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
|
@@ -179,44 +181,50 @@ $content" | |
test -z "$content" || | ||
test_expect_success "--batch-command $opt output of $type content is correct" ' | ||
echo "$batch_output" >expect && | ||
test_write_lines "contents $sha1" | git cat-file --batch-command $opt >actual && | ||
test_write_lines "contents $object_name" | git cat-file --batch-command $opt >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success "--batch-command $opt output of $type info is correct" ' | ||
echo "$sha1 $type $size" >expect && | ||
test_write_lines "info $sha1" | | ||
echo "$oid $type $size" >expect && | ||
test_write_lines "info $object_name" | | ||
git cat-file --batch-command $opt >actual && | ||
test_cmp expect actual | ||
' | ||
done | ||
|
||
test_expect_success "custom --batch-check format" ' | ||
echo "$type $sha1" >expect && | ||
echo $sha1 | git cat-file --batch-check="%(objecttype) %(objectname)" >actual && | ||
echo "$type $oid" >expect && | ||
echo $object_name | git cat-file --batch-check="%(objecttype) %(objectname)" >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success "custom --batch-command format" ' | ||
echo "$type $sha1" >expect && | ||
echo "info $sha1" | git cat-file --batch-command="%(objecttype) %(objectname)" >actual && | ||
echo "$type $oid" >expect && | ||
echo "info $object_name" | git cat-file --batch-command="%(objecttype) %(objectname)" >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success '--batch-check with %(rest)' ' | ||
echo "$type this is some extra content" >expect && | ||
echo "$sha1 this is some extra content" | | ||
echo "$object_name this is some extra content" | | ||
git cat-file --batch-check="%(objecttype) %(rest)" >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test_expect_success '--batch-check with %(objectmode)' ' | ||
echo "$mode $oid" >expect && | ||
echo $object_name | git cat-file --batch-check="%(objectmode) %(objectname)" >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
test -z "$content" || | ||
test_expect_success "--batch without type ($type)" ' | ||
{ | ||
echo "$size" && | ||
echo "$content" | ||
} >expect && | ||
echo $sha1 | git cat-file --batch="%(objectsize)" >actual && | ||
echo $object_name | git cat-file --batch="%(objectsize)" >actual && | ||
test_cmp expect actual | ||
' | ||
|
||
|
@@ -226,7 +234,7 @@ $content" | |
echo "$type" && | ||
echo "$content" | ||
} >expect && | ||
echo $sha1 | git cat-file --batch="%(objecttype)" >actual && | ||
echo $object_name | git cat-file --batch="%(objecttype)" >actual && | ||
test_cmp expect actual | ||
' | ||
} | ||
|
@@ -240,7 +248,7 @@ test_expect_success "setup" ' | |
git update-index --add hello | ||
' | ||
|
||
run_tests 'blob' $hello_sha1 $hello_size "$hello_content" "$hello_content" | ||
run_tests 'blob' $hello_sha1 "" $hello_size "$hello_content" "$hello_content" | ||
|
||
test_expect_success '--batch-command --buffer with flush for blob info' ' | ||
echo "$hello_sha1 blob $hello_size" >expect && | ||
|
@@ -270,7 +278,8 @@ tree_sha1=$(git write-tree) | |
tree_size=$(($(test_oid rawsz) + 13)) | ||
tree_pretty_content="100644 blob $hello_sha1 hello${LF}" | ||
|
||
run_tests 'tree' $tree_sha1 $tree_size "" "$tree_pretty_content" | ||
run_tests 'tree' $tree_sha1 "" $tree_size "" "$tree_pretty_content" | ||
run_tests 'blob' "$tree_sha1:hello" "100644" $hello_size "" "$hello_content" | ||
|
||
commit_message="Initial commit" | ||
commit_sha1=$(echo_without_newline "$commit_message" | git commit-tree $tree_sha1) | ||
|
@@ -281,7 +290,7 @@ committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE | |
|
||
$commit_message" | ||
|
||
run_tests 'commit' $commit_sha1 $commit_size "$commit_content" "$commit_content" | ||
run_tests 'commit' $commit_sha1 "" $commit_size "$commit_content" "$commit_content" | ||
|
||
tag_header_without_timestamp="object $hello_sha1 | ||
type blob | ||
|
@@ -295,7 +304,7 @@ $tag_description" | |
tag_sha1=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w) | ||
tag_size=$(strlen "$tag_content") | ||
|
||
run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" | ||
run_tests 'tag' $tag_sha1 "" $tag_size "$tag_content" "$tag_content" | ||
|
||
test_expect_success "Reach a blob from a tag pointing to it" ' | ||
echo_without_newline "$hello_content" >expect && | ||
|
@@ -1169,6 +1178,16 @@ test_expect_success 'cat-file --batch-check respects replace objects' ' | |
test_cmp expect actual | ||
' | ||
|
||
test_expect_success 'batch-command with a submodule' ' | ||
printf "160000 commit %0.*d\tsub\n" $(test_oid hexsz) 17 >tree-with-sub && | ||
tree=$(git mktree <tree-with-sub) && | ||
git cat-file --batch-check >actual <<-EOF && | ||
$tree:sub | ||
EOF | ||
printf "%0.*d submodule 0\n" $(test_oid hexsz) 17 >expect && | ||
test_cmp expect actual | ||
' | ||
|
||
# Pull the entry for object with oid "$1" out of the output of | ||
# "cat-file --batch", including its object content (which requires | ||
# parsing and reading a set amount of bytes, hence perl). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):