Conversation

@cosineblast

@cosineblast cosineblast commented Jan 12, 2024

Fixes #1596.
This implements a #pragma once directive in the C preprocessor.
This is achieved by finding the absolute path of the current file (using GetFinalPathNameByHandleA on Windows and POSIX 2008 realpath on other platforms) and saving it in a StringPool.
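
As a rough illustration of the POSIX half of that description, the sketch below resolves two spellings of a path with realpath and compares the results. The helper names CanonicalPath and SameCanonicalFile are hypothetical and are not taken from the PR; the Windows branch (GetFinalPathNameByHandleA) is not shown.

```c
#define _XOPEN_SOURCE 700   /* Ask for the POSIX.1-2008 declarations */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not cc65 code): resolve `name` to an absolute,
** symlink-free path. Returns a malloc'd string the caller must free,
** or NULL if the path cannot be resolved.
*/
static char* CanonicalPath (const char* name)
{
    /* POSIX.1-2008: with a NULL second argument, realpath() allocates
    ** the result buffer itself.
    */
    return realpath (name, NULL);
}

/* Returns 1 if both spellings resolve to the same canonical path,
** 0 otherwise (including when either one cannot be resolved).
*/
static int SameCanonicalFile (const char* a, const char* b)
{
    char* pa = CanonicalPath (a);
    char* pb = CanonicalPath (b);
    int same = (pa != NULL && pb != NULL && strcmp (pa, pb) == 0);
    free (pa);
    free (pb);
    return same;
}
```

With this, "foo/../thing.h" and "thing.h" compare equal as long as both exist and foo is a real directory. Hard links are not unified by this comparison, since each link has its own path.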

Makes it so that preprocessing a pragma yields a status flag with additional information on how to handle that particular #pragma.
Mostly a preparation for #pragma once.
@rofl0r
Contributor

rofl0r commented Jan 13, 2024

a bit of an anti-feature, as using it makes your code no longer conformant to C, but if it is to be implemented, it would be more elegant to just add a hidden macro like __CC65_INTERNAL_HEADER_FOO_H_INCLUDED__ on first occurrence of the pragma, and skip inclusion on later occurrences if that macro was encountered, instead of doing path shenanigans.

@cosineblast
Author

just add a hidden macro like __CC65_INTERNAL_HEADER_FOO_H_INCLUDED__ on first occurrence of the pragma, and skip inclusion on later occurrences if that macro was encountered

Hmm I don't get it, to be honest. Would the macro be added as the output of the preprocessor?
I'm aware the preprocessor emits #lines and pragmas as its output, but didn't know that emitting macros works as well, is it multipass?

@rofl0r
Contributor

rofl0r commented Jan 13, 2024

Would the macro be added as the output of the preprocessor?

no, it would just generate the macro and add it to the list of other defined macros while passing over a sourcefile, as if the user had written

#ifndef __CC65_INTERNAL_HEADER_FOO_H_INCLUDED__ 
#define __CC65_INTERNAL_HEADER_FOO_H_INCLUDED__

instead of #pragma once and then behave accordingly, except that no #endif is needed.

@cosineblast
Author

cosineblast commented Jan 13, 2024

I see, but I still believe we would have to find the absolute path of the file anyway (e.g. to generate the macro name),
if we want users to be able to deal with header files with non-unique names under different directories (e.g. object/util.h and path/util.h), and also for dealing with ..s (e.g. #include "foo/../thing.h" vs #include "thing.h").

@rofl0r
Contributor

rofl0r commented Jan 13, 2024

it should be sufficient to use the basename of the file, unless there's more than one file of the same name... to support this, you indeed need to use either the complete, normalized filename to generate a unique macro, or a cryptographic hash of the file contents (assuming the preprocessor code has access to the entire file contents from top to bottom during the processing of single lines).
either way my approach is more elegant as it reuses existing facilities (hashmap of macros) instead of adding new data structures, variables and code to handle this. apart from generating the unique macro name, the rest of the added code should be less than 10 lines. it's basically (pythonic pseudocode):

if pragma == "once":
  s = generate_unique_macro_based_on_filename(filename)
  if not s in macros:
    macros[s] = 1
  else:
    stop_preprocessing_current_file()

@cosineblast
Author

Oh I get it now. I was a bit confused since you said "instead of doing path shenanigans", and thought the problem was getting the full path for things. I assume you meant "instead of doing strpool shenanigans" instead?

@spiro-trikaliotis

I think if someone wants to roll out a #pragma once, the first thing to do is to define which heuristic will be used to assume it is the same file? Afterwards, this heuristic has to be implemented.

The problems with symlinks, hardlinks and hand-copied files are known, and defining the heuristic is not trivial, because most views are valid from one perspective or another.

@rofl0r
Contributor

rofl0r commented Jan 13, 2024

I was a bit confused since you said "instead of doing path shenanigans", and thought the problem was getting the full path for things. I assume you meant "instead of doing strpool shenanigans" instead?

no, i assumed taking the basename() of the file would be sufficient - i didn't think about the case of multiple files of the same basename, each using the pragma. it's generally confusing and bad practice to have several headers with the same filename (in different dirs though), but it might happen in complex projects with library code in subdirs. even then, one would assume the subprojects use proper include guards for portability instead of the microsoft-ish pragma once.

@cosineblast
Author

I think if someone wants to roll out a #pragma once, the first thing to do is to define which heuristic will be used to assume it is the same file? Afterwards, this heuristic has to be implemented.

Alright. So, this is what I plan to implement in this PR:

  • The heuristic for determining 'same-files' is going to be the absolute, 'real' path of the file. That is, for symbolic and hard links, the absolute path of the original/target file will be considered. This behavior matches the one of GCC and clang, but not MSVC.
  • Preprocessing #pragma once in a file will cause later #includes not to include that file.
    For instance, this means that having #pragma once at the bottom of a file has effectively the same effect as a #pragma once at the beginning of a file (as long as the preprocessor reaches the end of the file the first time the file is included)
    This matches the behavior of GCC and clang, can't confirm for MSVC.

For Linux, the 2008 POSIX revision of realpath will be utilized. For Windows, a combination of CreateFile and GetFinalPathNameByHandle will be used instead. The flags for both will replicate the ones used in the canonicalize Windows implementation of the Rust standard library.

However, I am still unsure about what to do to determine whether the to-be-included file has been seen with #pragma once or not. A macro is in fact more elegant, as @rofl0r pointed out, but it has some issues with the absolute file names, since those can include spaces and other symbols like + or emojis.

I think the most straightforward solution is to use a StringPool to store the absolute file names, and add a public function to this module, that allows one to check whether a given string is in the StringPool or not.

@groessler
Contributor

The heuristic for determining 'same-files' is going to be the absolute, 'real' path of the file. That is, for symbolic and hard links, the absolute path of the original/target file will be considered. This behavior matches the one of GCC and clang, but not MSVC.

For hard links you also need to look at the inode and device numbers.
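
A minimal sketch of that check on POSIX systems, assuming stat is available: two paths name the same underlying file when their device and inode numbers both match. SameInode is a hypothetical helper name, and Windows would need a different mechanism (file IDs), which is not shown.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both paths name the same underlying
** file (same device and same inode), 0 if they differ, and -1 if either
** path cannot be stat'ed.
*/
static int SameInode (const char* a, const char* b)
{
    struct stat sa, sb;

    if (stat (a, &sa) != 0 || stat (b, &sb) != 0) {
        return -1;
    }
    /* stat() follows symlinks, so this covers symlinks and hard links */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Unlike a path comparison, this also treats two hard links to one file as identical, at the cost of one stat call per candidate.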

What problem does this PR solve?

@cosineblast
Author

cosineblast commented Jan 13, 2024

What problem does this PR solve?

This PR does not intend to solve an existing problem in cc65, but instead to implement the #pragma once feature, present in some other C compiler implementations.

For hard links you also need to look at the inode and device numbers.

My bad. Trying to get it to behave like gcc and clang would be great, but maybe it's a bit overkill to check hard links as well? What do you think?

@cosineblast
Author

cosineblast commented Jan 18, 2024

PR Status

This pull request currently implements #pragma once, as described in an earlier comment, except it does not consider file and filesystem IDs for hard links. I can implement it if it is desirable, but I'd like to get some feedback on what is implemented so far.

@cosineblast cosineblast marked this pull request as ready for review January 18, 2024 02:56
@cosineblast cosineblast requested a review from rofl0r January 19, 2024 01:58
@rofl0r
Contributor

rofl0r commented Jan 20, 2024

since you ask for a review i judge by this metric: +339 −15
350 changed lines for a complete antifeature. to be fair, about 100 of that is tests and another 20 docs (plus another 20 for silly copyright headers due to use of new files), but this is still a huge change. imo it should be max 20 lines for the pathname utility function and another 10-15 for inserting the macro as i proposed. thus i vote for "do not merge".

@cosineblast
Author

cosineblast commented Jan 20, 2024

I appreciate your constructive criticism. Is this sentiment towards this implementation/feature shared between other collaborators?

With regard to the macro table, I've been avoiding touching it so far, because storing absolute file names in it sounds hacky to me, but if that's seen as acceptable, I'll have no problem adapting this into the pull request.

@polluks
Contributor

polluks commented Jan 20, 2024 via email

@mrdudz
Contributor

mrdudz commented Jan 22, 2024

OK so, i was holding back with commenting, because i wanted to hear some other opinions first.

First of all, some nitpicking on the PR itself:

  • Uz's copyright notice shouldn't be copied to new files (Uz has nothing to do with this)
  • The PR assumes the OS is either Windows, or POSIX compatible - there should be a fallback for when neither is the case (think eg AmigaOS)

Now for:

Is this sentiment towards this implementation/feature shared between other collaborators?

I have to admit that i am not a fan of this extension, nor do i see a point in using it at all. Like rofl0r said, this is a rather huge change, for something that's not even a standard feature. I have my doubts that it would save anyone as much typing as what went into this PR :) I also think the "convert into macro" solution is the better one (and you can eg encode paths into BASE64 - or even just plain hex codes - to get rid of problems with special characters). Because of this i am also leaning towards "don't merge" right now. I'd like to hear opinions from other contributors however, like @oliverschmidt @acqn
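
The "plain hex codes" idea could be sketched as follows: every byte of the path becomes two hex digits, so spaces, + signs and non-ASCII bytes all end up as valid identifier characters. The function name GuardNameFromPath and the __CC65_ONCE_ prefix are made up for this example and are not part of cc65.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a macro-safe guard name from an arbitrary
** path by hex-encoding every byte. Returns a malloc'd string the caller
** must free, or NULL on allocation failure.
*/
static char* GuardNameFromPath (const char* path)
{
    static const char Prefix[] = "__CC65_ONCE_";    /* Made-up prefix */
    static const char Hex[] = "0123456789ABCDEF";
    size_t len = strlen (path);
    size_t prefixLen = sizeof (Prefix) - 1;
    char* name = malloc (prefixLen + 2 * len + 1);
    size_t i;

    if (name == NULL) {
        return NULL;
    }
    memcpy (name, Prefix, prefixLen);
    for (i = 0; i < len; ++i) {
        unsigned char c = (unsigned char) path[i];
        name[prefixLen + 2 * i]     = Hex[c >> 4];      /* High nibble */
        name[prefixLen + 2 * i + 1] = Hex[c & 0x0F];    /* Low nibble */
    }
    name[prefixLen + 2 * len] = '\0';
    return name;
}
```

For example, the path "a b" yields __CC65_ONCE_612062. The encoding is reversible and collision-free, but the resulting names grow to twice the path length.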

@oliverschmidt
Contributor

As I'm explicitly asked: My personal opinion is to not merge this PR.

@cosineblast
Author

Well, thank you all for the time invested. I'd like to have some clarification on some things.

  1. Is pragma once even worth implementing in cc65? If the answer is no, then I ask for the #pragma once #1596 issue to be closed, or at least, reviewed.

The following questions depend on the answer of 1. being positive, for a future reference in case anybody tries to implement it.

  2. If pragma once were to be implemented, the consensus seems to be using the macro table to store the information. However I still have a question with regard to the mechanics of the pragma. Would an ideal implementation go for modifying #include so that it doesn't include files which have already been seen in #pragma once? Or would it be implemented by stopping the preprocessing when a pragma once is seen on a file that has already been marked as such (e.g. two #pragma once's in a row stops the preprocessing altogether).

@oliverschmidt
Contributor

@cosineblast: Just my two cents as former maintainer:

Yes, if you implement an open feature request and your implementation has no obvious flaws then you should be able to assume that your implementation is welcome. Period.

Why isn't it like this? Because "weird" feature requests aren't closed.

Why aren't "weird" feature requests closed? Because closing a feature request results in lengthy discussions the maintainer wants to avoid.

Is there a way to avoid this situation in the future? My personal(!) proposal is to add a boilerplate comment to all open feature requests saying something along the lines of "If you consider closing this feature request by actually implementing the feature requested, then please get in touch with the maintainer before investing any effort."

@cosineblast
Author

fair enough

@mrdudz
Contributor

mrdudz commented Jan 23, 2024

I have added a note about this in https://github.com/cc65/cc65/blob/master/Contributing.md instead (which i expect anyone to read before starting anyway)

@cosineblast
Author

I plan on working on this PR in the near future, with the goal of changing the file verification mechanism so that it uses the macro table.

If pragma once were to be implemented, the consensus seems to be using the macro table to store the information. However I still have a question with regard to the mechanics of the pragma. Would an ideal implementation go for modifying #include so that it doesn't include files which have already been seen in #pragma once? Or would it be implemented by stopping the preprocessing when a pragma once is seen on a file that has already been marked as such (e.g. two #pragma once's in a row stops the preprocessing altogether).

Before going on with things, I'd like to ask which of these options would be more suitable (or another alternative I didn't think of).

For ease of explanation, I will refer to the former approach as the #include approach, as it affects the behaviour of #include, and the latter as the #pragma #pragma approach, as it effectively halts preprocessing once a second pragma is seen in the same file.

I was planning on going with the #include one, as it matches an existing behavior in other compilers. However, it has the minor downside that it requires modifications in OpenIncludeFile in input.c as opposed to the #pragma #pragma one, in which include code is left intact.

I was just going to keep going with the #include one, but I've learned it's best to discuss this sort of thing.
(Thanks for the reality check)

@mrdudz
Contributor

mrdudz commented Feb 9, 2024

I would certainly prefer whatever requires fewer changes/additions in the compiler. I can't say which it is however. I'd also prefer a solution that doesn't require OS specific hackery (but i don't think it can work without in practice).

@GorillaSapiens

at the risk of reigniting a flame war, this seems a little complex. why not, before inclusion, scan the file for a "#pragma once". if it has one, checksum the file. if the checksum matches something else already included as a pragma once, skip the include. what's all this argument about paths and hidden macros ?
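
One way to read that suggestion is sketched below: a single pass over the file that both looks for a "#pragma once" line and computes a checksum of the contents. The helper name ScanForPragmaOnce is hypothetical, FNV-1a is just a convenient non-cryptographic hash picked for the example, and the scan is deliberately naive: it ignores comments, #if nesting and line continuations.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read `f` to the end, store a 32-bit FNV-1a
** checksum of its contents in *hash, and return 1 if some line starts
** (after leading blanks) with "#pragma once", 0 otherwise.
*/
static int ScanForPragmaOnce (FILE* f, unsigned long* hash)
{
    char line[512];
    int found = 0;
    unsigned long h = 2166136261UL;             /* FNV-1a offset basis */

    while (fgets (line, sizeof (line), f) != NULL) {
        const char* p = line;
        size_t i;

        /* Fold every byte of this chunk into the checksum */
        for (i = 0; line[i] != '\0'; ++i) {
            h ^= (unsigned char) line[i];
            h = (h * 16777619UL) & 0xFFFFFFFFUL;    /* FNV prime, 32 bits */
        }

        /* Naive textual check; lines longer than the buffer are seen
        ** in pieces, which a real implementation would have to handle.
        */
        while (*p == ' ' || *p == '\t') {
            ++p;
        }
        if (strncmp (p, "#pragma once", 12) == 0) {
            found = 1;
        }
    }
    *hash = h;
    return found;
}
```

A caller would keep the checksums of all files already included with the pragma and skip an include whose checksum is in that set. Identical copies of a header in different directories would then count as the same file, which may or may not be the intended heuristic.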

@cosineblast
Author

The main issue is that I was trying to go for the same behaviour as mainstream compilers, which was a bit overkill for the scope of cc65

@mrdudz mrdudz mentioned this pull request Jun 23, 2025
@mrdudz mrdudz marked this pull request as draft June 23, 2025 13:22
@mrdudz mrdudz changed the title from "[cc65] Pragma once" to "Pragma once" Jun 26, 2025
@kugelfuhr
Contributor

A few comments since this came up in #2743.

It is not only important to define heuristics regarding "same file" semantics. The very first thing would be to define what #pragma once should do and how it is supposed to work. I'm not aware of a formal definition of #pragma once. Which is to be expected, since it is non-standard.

So, should #pragma once cover the whole file, regardless of its position in the source? Or is just the remainder of the code in this file skipped? If #pragma once covers the whole file, then how is this handled:

$ cat test.h
#ifdef FOO
#pragma once
#endif

This might be used as

$ cat test.c
#define FOO
#include "test.h"
#undef FOO
#include "test.h"

If #pragma once is not allowed within #if, should the code check for it? Or would it be reasonable to check that #pragma once is the first valid token sequence in the file? This would automatically rule out any #if around it.

If #pragma once just skips the remainder of the code, placing it into #if might cause problems. In the following snippet, the #endif is skipped, which means it is not even possible to close the #ifdef.

#ifdef FOO
#pragma once
#endif
