Conversation

ioquatix
Member

ioquatix commented Mar 9, 2023

@ioquatix marked this pull request as draft March 9, 2023 09:48
* @param[in] name An instance of ::rb_cString.
* @retval mod
*/
VALUE rb_mod_name_set(VALUE, VALUE);
Member

Does this function need to be exposed?

Member Author

I just added it next to rb_mod_name for consistency, but we don't have to.

I don't have a strong opinion about it. I can imagine some use cases.

Member

I think better not to expose it, one can just use rb_funcall if they want to call it from C.
Also this function name does not include temporary or anything, so it seems better to stay an internal impl detail.

Member Author

@ioquatix Jun 19, 2023

It's quite messy to optionally detect something that must be done at runtime using rb_funcall, vs have_func("rb_mod_name_set"). I don't have a particular preference about the name, but I think the proposed name is consistent with the other names. However, if you have a better suggestion, I'm happy to adopt it.

Member

> It's quite messy to optionally detect something that must be done at runtime using rb_funcall

It's not, there is rb_respond_to() and rb_check_funcall().

I think it's obvious that rb_mod_name_set would be the (unlimited) setter and rb_mod_name the getter. But that's not the case. There is no reason to expose this C API function and it has a cost.
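
For readers less familiar with runtime feature detection, here is a minimal Ruby-level sketch of the idea behind rb_respond_to() and rb_check_funcall(): ask the object whether it supports the method before calling it. The helper name `with_temporary_name` is hypothetical and not part of this PR. It assumes the module passed in is anonymous, and it leaves the module unnamed on Rubies without Module#set_temporary_name.

```ruby
# Hypothetical helper (not part of the PR): names an anonymous module only
# when the running Ruby provides Module#set_temporary_name.
def with_temporary_name(mod, name)
  if mod.respond_to?(:set_temporary_name)
    mod.set_temporary_name(name)
  end
  mod
end

m = with_temporary_name(Module.new, "fake<example>")
# Prints the temporary name where the method exists, nil otherwise.
puts m.name.inspect
```

A C extension would make the same decision with rb_respond_to() or rb_check_funcall() instead of a compile-time have_func check.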

Member Author

If the function is exposed in Ruby, why are you so strongly against exposing it as part of the C API? Is it because it's difficult for compatibility with TruffleRuby's C interface layer?

I'm okay to hide it for now, but I may introduce it if I need it.

Member

> why are you so strongly against exposing it as part of the C API?

  • I think the function name is bad/confusing, there is no need to discuss it if we don't add it
  • There is no need to add it
  • There is a cost to expose a C API function, it needs to be tested (e.g. C API tests) and reimplemented on other Ruby implementations. IMO we should not expose something which does exactly the same as rb_funcall, that's duplication (unless it makes a big difference in performance maybe, not relevant here).

Comment on lines 14 to 15
m.set_temporary_name("Foo")
m.name.should == "Foo"
Member

@eregon Jun 18, 2023

I'm uncomfortable with this, it means Module#name is completely disconnected from the constant path (i.e. it's unrelated to constant Foo).

Can we at least prevent a temporary name from starting with an uppercase letter, so that these "fake" names can be easily distinguished from real constant paths? (i.e., raise if the temporary name starts with an uppercase letter)

Member Author

What I think we should introduce later, is Module#permanent? or Module#anonymous? or some way you can check whether a name is actually a real constant or not. I'm also okay with updating #to_s and #inspect to reflect this if that's compatible enough.

I personally think your proposed change is too restrictive. One use case for setting a "proper" name on an anonymous module is to do something like this:

module Template
  def self.[](path)
    self.new(path)
  end

  def self.new(path)
    m = Class.new
    m.set_temporary_name("#{self}[#{path.dump}]")
    # ...
    return m
  end
end

Template["foo.rb"]
# => Template["foo.rb"]

Member

> Module#permanent? or Module#anonymous?

This wouldn't help anyone e.g. seeing an error message containing that fake constant name.
But it could be useful to tell if Class#name is related to a constant path or not.

> I'm also okay with updating #to_s and #inspect to reflect this if that's compatible enough.

This could work, if it also affects the name shown in exception messages (rb_mod_name() is used I think).
It should be compatible, since so far anonymous modules have an anonymous-like to_s/inspect anyway.

> I personally think your proposed change is too restrictive. One use case for setting a "proper" name on an anonymous module is to do something like this:

It could be template["foo.rb"] and then it is immediately clear to the reader there is no Template constant, unlike undefined method 'foo' for #<Template["foo.rb"]:0x00007efc38711fc0> (NoMethodError).
But yeah it wouldn't work to actually eval template["foo.rb"].

We could also check if the temporary name is a valid constant name, and only forbid those.
At least that name is not a valid constant name:

irb(main):003:0> m=Module.new
=> #<Module:0x00007fa89c26a970>
irb(main):004:0> m.const_set('Template["foo.rb"]', 42)
(irb):4:in `const_set': wrong constant name Template["foo.rb"] (NameError)

It is harder to visually notice though.
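
The Module#permanent? / Module#anonymous? idea discussed above (telling whether a module's name relates to a real constant) can be approximated in plain Ruby. The sketch below is only an illustration: `permanent_name?` is a hypothetical helper, not an existing or proposed API. It treats a name as "real" when looking it up from Object yields the same object, so it would also report false for a module whose constant was later removed.

```ruby
# Hypothetical helper (no such method exists in Ruby): reports whether
# mod.name resolves, via constant lookup from Object, back to mod itself.
def permanent_name?(mod)
  name = mod.name
  return false if name.nil? # anonymous module
  Object.const_defined?(name) && Object.const_get(name).equal?(mod)
rescue NameError
  false # the name is not a valid constant name, e.g. Template["foo.rb"]
end

puts permanent_name?(String)     # => true
puts permanent_name?(Module.new) # => false
```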

Member Author

@ioquatix Jun 19, 2023

It's not really possible to check if the constant is already set or not, because someone could define it after the fact. The entire point is to avoid collisions with global state so I think it's a bad design. i.e. it shouldn't matter what name you want to give to an anonymous module.

> there is no Template constant

Actually, in my example, there is.

Comment on lines 35 to 36
m::N.set_temporary_name("Foo")
m::N.name.should == "Foo"
Member

This is also very strange, that module has a lexical parent (m), but its Module#name does not reflect it.
Again it's a very confusing "lie" like above.
If it's foo or <#tmp name:Foo> or so then there is no confusion with a constant path.

eregon previously requested changes Jun 18, 2023
Member

@eregon left a comment

I would like some changes, notably in the specs. I am strongly against giving fake constant-like names in the specs and the docs, because it encourages misuse of the feature. I think the best option is to prevent that by raising if the temporary name starts with an uppercase letter.

@ioquatix
Member Author

Can you define what is "misuse of this feature" and what the practical problems are?

@ioquatix self-assigned this Jun 19, 2023
@ioquatix marked this pull request as ready for review June 19, 2023 00:18
@eregon
Member

eregon commented Jun 19, 2023

> Can you define what is "misuse of this feature" and what the practical problems are?

Yes, e.g. what you are doing in the specs, naming an anonymous module Foo when it has never been related to the constant path Foo.
That is IMO an abuse and misuse of this feature to confuse people into thinking there is such a constant when there is not.
If one wants such a name they should actually assign the module to that name, which is what 99.99% code does today.
Or choose a temporary name that isn't a valid constant name to be clear.

@ioquatix dismissed eregon’s stale review June 21, 2023 07:40

All feedback addressed.

@ioquatix merged commit a87bce8 into ruby:master Jun 21, 2023
@ioquatix deleted the class-set-name branch June 21, 2023 07:49
@eregon
Member

eregon commented Jun 21, 2023

Thank you, this feels much safer and much better-defined with the check.

rb_raise(rb_eArgError, "empty class/module name");
}

if (rb_is_const_name(name)) {
Member

Actually we should reject constant paths too, because e.g. Module.new.set_temporary_name("Foo::Bar") would be bad/confusing. I'll try to find a C function doing this.

Member

I didn't find any existing C function for that, but there is logic doing very similar things in const_get and in const_defined?. Could you create a function to check that and use it here?

Member Author

Sure, I can take a look.

@fxn
Contributor

fxn commented Jul 2, 2023

Please, note that constant paths do not raise in master (maybe it's just pending).

@eregon
Member

eregon commented Jul 3, 2023

> Please, note that constant paths do not raise in master (maybe it's just pending).

Right, we should fix that ASAP before code starts relying on it.
@ioquatix Could you do it?

One idea would be to use a Regexp and define the method in Ruby, if writing the check in C proves difficult.
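
As a rough illustration of the Regexp idea, the sketch below flags names in which every `::`-separated component looks like a constant identifier. It is not the check that was eventually merged. The method name `permanent_looking?` is hypothetical, and the pattern is ASCII-only for simplicity, whereas Ruby itself also accepts non-ASCII uppercase letters at the start of a constant name.

```ruby
# Hypothetical check: true when the name is one or more constant
# identifiers joined by "::", i.e. when it resembles a permanent name.
def permanent_looking?(name)
  /\A[A-Z]\w*(?:::[A-Z]\w*)*\z/.match?(name)
end

["Foo", "Foo::Bar", "a::B", "::A::B", 'Template["foo.rb"]', "foo"].each do |name|
  puts "#{name.inspect}: #{permanent_looking?(name)}"
end
```

Under this pattern only "Foo" and "Foo::Bar" are flagged. Names such as a::B, ::A::B and Template["foo.rb"] pass, which matches the intent voiced in this thread of rejecting only names that resemble a permanent name.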

@fxn
Contributor

fxn commented Jul 3, 2023

I am curious about the terminology here. To me, both A::B and a::B are constant paths. Do you guys agree?

So, what needs to trigger the error is a constant path that starts with a constant identifier. I don't know if there is a term for that.

@eregon
Member

eregon commented Jul 3, 2023

> I am curious about the terminology here. To me, both A::B and a::B are constant paths. Do you guys agree?

I would say the first is a constant path, the second is a constant lookup. I can see what you mean though.

Here we want to prevent constant paths where every component is a valid constant identifier.

@fxn
Contributor

fxn commented Jul 3, 2023

@eregon interesting, I am drafting a little book about constants in Ruby, and questions like this arise.

The way I see it, in a constant path the first segment is an expression. To me, A and a syntactically are the same. At runtime, both A and a are expressions that evaluate to an arbitrary value. The first one needs a relative lookup, the second one is a variable. If the returned value is a class or module, you resolve B on it, otherwise, you raise.

As an implementor, you see it in a different way perhaps?

@fxn
Contributor

fxn commented Jul 3, 2023

> To me, A and a syntactically are the same.

Hmmm, that is not correctly expressed. What I mean is that both are expressions, they both start a chain of constant name resolutions to their right.

@eregon
Member

eregon commented Jul 3, 2023

Yeah, if a = A by referential transparency they are the same.
:: is a constant lookup operator, the LHS can be anything, also (sleep 1; Thread)::Backtrace.
That doesn't look very constant path-y though, hence I think most Rubyists expect only valid constant names as components of a constant path, and I would define a constant path as such.

I think it's uncommon to use small::Foo in Ruby, so I think many Rubyists would probably be surprised by such syntax.
And of course a "permanent Module#name/classpath" has no start-with-lowercase component, so if people use that to refer to a module they won't have any start-with-lowercase component.
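
The behaviour of `::` described in the last few comments can be checked directly. The snippet below is a small sketch using only core Ruby. The variable names are arbitrary, and `(rand; Math)` stands in for the `(sleep 1; Thread)` example to avoid the delay.

```ruby
# A lowercase local variable on the left of `::` still performs a constant lookup.
a = Math
puts a::PI == Math::PI

# The left-hand side can be any expression that evaluates to a class or module.
holder = Module.new
holder.const_set(:B, 42)
puts holder::B
pi = (rand; Math)::PI
puts pi == Math::PI

# If the left-hand side is not a class or module, the lookup raises.
begin
  x = 1
  x::B
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```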

@fxn
Contributor

fxn commented Jul 3, 2023

> I think many Rubyists would probably be surprised by such syntax.

Maybe, but in the book I have in mind I want to cover Ruby, the language, not what is popular or most commonly known.

It is very important that you understand that a::B is a thing, because it helps you break with the illusion of types. It helps you see this is all orthogonal. The first expression gives you a value. If that value is a class or module object, then you carry on.

@fxn
Contributor

fxn commented Jul 4, 2023

The difference between a::B and A::B is that the second one is globally available, and is the kind of thing you get as permanent name. What you do not want in a temporary name is A::B, you do not want something that resembles a permanent name.

Since a::B cannot be a permanent name, it is fine. Regardless of terminology.

Aside, I personally call both a::B and A::B constant paths, because they are paths that take you towards a constant via constant lookup. Whether the first expression is a constant identifier itself or not does not matter. A constant there is just one particular case of expression. It is not special.

@fxn
Contributor

fxn commented Jul 4, 2023

BTW, something I have not explicitly said in this conversation is that I am thinking about the error message.

The error message cannot be "a temporary name cannot be a constant path". Because there are constant paths that can be temporary names, like ::A::B (or a::B in my working definition). The error condition is "a temporary name cannot look like a permanent name". Not proposing that wording, just illustrating my point.

@fxn
Contributor

fxn commented Jul 4, 2023

Maybe we could sidestep error message nuances and simplify implementation by disallowing anything that starts with an uppercase letter? Regardless of whether it is a well-formed permanent name or not?

@eregon
Member

eregon commented Jul 4, 2023

> Maybe we could sidestep error message nuances and simplify implementation by disallowing anything that starts with an uppercase letter? Regardless of whether it is a well-formed permanent name or not?

That would be too limiting, on the issue some people want e.g. Template[path/file.rb] as a temporary name and that can actually resolve to the right thing when eval'd (e.g. useful for debugging).

@fxn
Contributor

fxn commented Jul 4, 2023

> That would be too limiting, on the issue some people want e.g. Template[path/file.rb] as a temporary name and that can actually resolve to the right thing when eval'd (e.g. useful for debugging).

Ahhh, I see it was discussed in this very thread indeed. OK!

@eregon
Member

eregon commented Jul 6, 2023

PR to add check for constant path: #8035

@fxn
Contributor

fxn commented Jul 6, 2023

I was about to follow up, because I have discussed the terminology with Matz a bit. I am just waiting for one last question.

Expressions like A, m::X, ::A, or A::B are constant references. Just like his book with David Flanagan calls them. They can be simple or compound (whenever there is a lookup operator).

A subset of constant references are called constant paths. A constant path is a constant reference whose segments are constants. Therefore, A::B is a constant path, and m::X is not. (My remaining question is whether ::A::B is a constant path.)

Additionally, a subset of constant paths is class or module paths. This is informal, and refers to a constant path that is additionally the canonical/permanent name of a certain class or module object. So, String is a class path, but if we do S = String, then S is not a class path.
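
The distinction between a class path and an ordinary constant reference can be observed through Module#name. The sketch below uses Object.const_set in place of a top-level `S = String` so that it runs in any context. The constant names ExampleNamed and ExampleAlias are made up for illustration.

```ruby
# Equivalent to writing `S = String` at the top level.
Object.const_set(:S, String)
puts S.name            # => String
puts S.equal?(String)  # => true

anon = Class.new
puts anon.name.inspect # => nil

# The first constant assignment fixes the permanent name.
Object.const_set(:ExampleNamed, anon)
puts anon.name         # => ExampleNamed

# Later assignments create another reference but do not rename the class.
Object.const_set(:ExampleAlias, anon)
puts ExampleAlias.name # => ExampleNamed
```

In the terminology above, String and ExampleNamed are class paths, while S and ExampleAlias refer to the same objects without being their permanent names.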

@fxn
Contributor

fxn commented Jul 6, 2023

Update: He does not consider ::A::B to be a constant path, it is just a constant reference.

Let me follow up in the other PR.

@fxn
Contributor

fxn commented Jul 7, 2023

Update: While we chatted about the three concepts, in the end Matz concluded this very explicitly:

> There's no constant path in Ruby.

He only uses two terms: constant reference, and class/module path.

matzbot pushed a commit that referenced this pull request Feb 26, 2024
This is a fix-up of #7483.

Test method names in Rubygems are often very long, and the path under
the temporary directory generated from them can easily exceed system
limits.

Even without including the method name in the path name, by saving the
method and path name pair, it is possible to find the method from the
remaining path name.

rubygems/rubygems@de55064a3d